[Evolution] The attached email causes evo to consume 50% of dual cpu and all ram until killed



Re-sending, as original doesn't appear to have made it to the list.
Also filing a bug as testing after a git head rebuild does not remedy
it.

https://bugzilla.gnome.org/show_bug.cgi?id=613181

As info -- and in case anyone else wants to validate/invalidate.

Using evo git head built ~10 AM yesterday.  The attached email was
downloaded via fetchmail and placed into an evo maildir account.
Selecting the email in evo drives evo's cpu usage on a dual-core Core 2
to 50% of all available cpu, and evo's memory usage starts increasing
continuously.  Evo becomes unresponsive (the system pretty much does
also).  The status bar states "Formatting message (0% complete)".
The result is that evo must be killed.

Not sure if it's just pertinent to my setup or not.  I'll flush and
rebuild and see if the issue persists.

reid


--- Begin Message ---
Title: TechCrunch

The Latest from TechCrunch

Link to TechCrunch

Google Makes Exchanging Microsoft Exchange For Google Apps A Bit Easier

Posted: 17 Mar 2010 08:59 AM PDT

There’s no question that Google is setting its sights on taking some of Microsoft’s market share in the productivity suite space. Last year, Google announced a new plug-in that syncs Google’s enterprise versions of Apps, including Gmail, contacts, and calendar, with Microsoft’s Outlook. And Google just acquired Docverse, an application that lets users collaborate directly on Microsoft Office documents. Today Google is taking another swipe at Microsoft with a new tool that makes it significantly easier to make the switch over to Google Apps from Microsoft Exchange.

Google Apps Migration for Microsoft Exchange is a new server-side tool that migrates a company’s email, calendar and contact data from Microsoft Exchange, an email server software product from Microsoft, to Google Apps. Google promises ease with the tool, allowing IT administrators the ability to select the mail, calendar and contact data to move in phases and migrate hundreds of users at the same time. Plus, employees can use Exchange during the migration without any interruption. The tool works with Exchange 2003 and 2007 for both on-premise and hosted applications and is available to the enterprise and education versions of Google Apps.

This is clearly a play at showing businesses how simple it is to move from Microsoft products, such as Exchange, that may not be hosted in the cloud to the cloud-based Google Apps product. Google Product Manager Matt Glotzbach told me that the search giant wants to make it as simple as possible for potential customers to make the switch to Google Apps, and many potential Google Apps clients are using Microsoft Exchange to host and power email, calendar, and contacts. Google also launched Google Apps Migrator for Lotus Notes and a Connector for BlackBerry Enterprise Server.

Google Apps has steadily been growing; already 25 million people are using the Apps product. And that also includes over 2 million businesses ranging from startups, to small businesses, to Fortune 500 companies. And Google is developing a compelling ecosystem around Google Apps, recently launching the Google Apps Marketplace, which is an app store for enterprise apps in the cloud.



Mobile Social Network MocoSpace Now 11 Million Members Strong

Posted: 17 Mar 2010 08:58 AM PDT

Mobile social network MocoSpace now has a count of 11 million members, with 500,000 members forming new friendships every day on MocoSpace. The startup’s mobile-only social network targets users who have non-smartphones that have simpler interfaces.

MocoSpace, which launched in 2006, makes money with its virtual currency and through advertising and mainly reaches the 18 to 34 age demographic. The site claims to generate 3 billion pages per month, with mobile users accessing the site over 5 times per day on average. The site is also generating interest from musicians using the site to share their music, with over 200 artists submitting music on MocoSpace every day. Though not nearly as popular as Facebook or MySpace, MocoSpace is now one of the largest mobile-only social networks.

The startup prides itself on its users mainly being non-techies who don’t own an iPhone, Android or BlackBerry device. MocoSpace also claims to have a diverse user base; 1/3 of their user base is Hispanic and 1/3 is African-American. We recently published some surprising stats about the breakup habits of MocoSpace users.




Complaints Against Yelp’s “Extortion” Practices Grow Louder

Posted: 17 Mar 2010 08:01 AM PDT

Yelp has been hit with another lawsuit, the third in a matter of a few weeks. Similar to the previous complaints, this lawsuit filed by Boris Levitt, the owner of Renaissance Furniture Restoration in San Francisco, claims that Yelp’s “unfair and unethical conduct in promoting, marketing and advertising its website as maintaining unbiased reviews” is unlawful and hurt his business. Levitt’s suit is similar to the previous claims that Yelp is extorting businesses for advertising. We’ve embedded the complaint below.

The business claims that after declining a request to purchase advertising on Yelp, a number of positive reviews from his business’ listing on the reviews site mysteriously disappeared, downgrading the company’s rating on the site. Levitt claims that ten out of eleven five star reviews were removed from his company’s page following his decision not to purchase advertising on Yelp.

Two weeks ago, the company got slapped by a lawsuit from the D’ames Day Spa of San Diego County, accusing Yelp of removing many positive reviews because the spa declined to run ads on the site. And the previous week, two law firms, Beck & Lee from Miami and The Weston Firm in San Diego, filed a class action lawsuit in Los Angeles federal court alleging unfair business practices by local business review and rating website operator Yelp.

The lawsuit alleged that the heavily funded startup runs an “extortion scheme” and has “unscrupulous sales practices” in place to generate revenue, in which the company’s employees call businesses demanding monthly payments in the guise of advertising contracts, in exchange for removing or modifying negative reviews.

Additionally, today, nine small businesses from across the United States have joined the Beck & Lee and Weston suit, including Bleeding Heart Bakery of Chicago, Illinois; Scion Restaurant of Washington, D.C.; J.L. Ferri Entertainment, Inc. of New York, New York; Sofa Outlet of San Mateo, California; Celibré, Inc. of Torrance, California; Astro Appliance Service of San Carlos, California; Wag My Tail, Inc., of Tujunga, California; Le Petite Retreat, of Los Angeles, California; and Mermaids Cruise of San Francisco, California.

Last year, the East Bay Express ran an explosive story, basically accusing Yelp of being in the ‘Business of Extortion 2.0’, which covered similar ground. Shortly after reporter Kathleen Richards published the article, Yelp vehemently denied everything and called her piece inaccurate.

Yelp CEO and co-founder Jeremy Stoppelman has explicitly denied that they ever offered preferential treatment in exchange for money.


Boris Levitt Vs. Yelp


URL Shorteners Slow Down The Web – Especially Facebook’s FB.me

Posted: 17 Mar 2010 08:01 AM PDT

It’s hard to imagine a Web sans URL shortening services nowadays but you can rest assured that they’re here to stay – for better or worse. Question is: how do the likes of bit.ly, TinyURL and Goo.gl score in terms of speed and availability?

That’s exactly what Dutch startup WatchMouse sought to find out, by monitoring the performance and uptime for 14 popular URL shorteners for a whole month.

Turns out most really don’t perform all that well, and that URL shorteners actually increase the load time of pages significantly. As you can tell from the graph embedded above, a lot of URL shortening services add half to nearly a full second to page load times.

To measure this, WatchMouse checked each URL shortener every five minutes from one of its monitoring stations, which are located across the globe. For each short URL, only the redirection was measured, not the actual loading of the target page.

Pingdom did similar research on the speed and reliability of URL shortening services in August 2009, although they only looked at independent URL shorteners and not the ones from Microsoft, Facebook and Google.

Google does a pretty good job in terms of performance with Goo.gl and YouTu.be, but it still takes those about 1/3 of a second to resolve pages, which makes a world of difference if you think about how many website addresses get shortened on a daily basis.

According to WatchMouse’s findings, Facebook’s FB.me is by far the slowest of the pack, adding over two seconds on average to the page load time after the click on a link.

Another interesting thing the company noticed is that only a few of the URL shorteners optimized their name servers for international use – i.e. it takes half a second for some of the URL shorteners just to look up the IP address that is needed for a browser to retrieve a Web page.

As for the availability of the URL shortening services: most do reasonably well in this regard, with snurl.com performing worst of the bunch with south of 98% uptime. Facebook’s fb.me registered the third worst uptime out of the 14 services that were tracked, although that still means about 99.5% availability – which isn’t terrible.

Now all they have to do is speed it up a little.



AOL Partners With Celebrity Chefs To Launch Recipe And Foodie Site KitchenDaily

Posted: 17 Mar 2010 07:10 AM PDT

AOL has recruited a few celebrity chefs and foodies, including Curtis Stone, Food & Wine’s Gail Simmons, and Marcus Samuelsson, along with the famed Culinary Institute of America, to launch food website KitchenDaily. Similar to Epicurious, AllRecipes, or FoodNetwork.com, KitchenDaily features a recipe database of meals that have been tested by top chefs, food magazines and cookbook publishers.

AOL has recruited talent from Conde Nast’s recently shuttered Gourmet Magazine. Former Gourmet Editor Cheryl Brown is the editor-in-chief with a number of former Gourmet writers named as contributors. Epicurious’ Megan Steintrager will be the senior editor of the site.

In addition to recipes, KitchenDaily will feature a meal-planner tool, cooking lessons, cookbook reviews, and more than 250 instructional videos created by celebrity chefs and industry insiders. AOL already has a food blog called Slashfood, though it’s unclear how the two sites will be integrated.

The recipe site space is crowded, with a number of worthy competitors vying for traffic. And even Microsoft’s Bing launched its own recipe search product. But it seems that KitchenDaily aims to be part magazine, part recipe site, so it may be able to differentiate itself and possibly attract many former Gourmet readers. We know that AOL CEO Tim Armstrong is bullish on niche content so the launch of this site fits into the company’s strategy nicely.



Former Yahoo Execs Launch nPario To Help Companies Understand Consumers

Posted: 17 Mar 2010 07:10 AM PDT

A new startup dubbed nPario and formed by ex-Yahoo and SAS executives opened its business operations today. The company essentially wants to help clients better understand and market consumer commercial intent through optimal data management and data mining products and services.

The company is led by former senior Yahoo executives Bassel Ojjeh (he left the Sunnyvale company in November 2009) and Krishna Uppala, and former SAS executive Basel Tutunji. According to a regulatory filing, the Palo Alto startup recently raised $300,000 in seed funding.

The company’s website is pretty scarce on details, but according to the release nPario will deliver data solutions that allow companies to increase their revenue by acting upon consumer behavior insights.

In the words of Ojjeh, founder, president and CEO of nPario:

“The digital world gives us an unprecedented opportunity to identify and understand the commercial intent of consumers in order to deliver the right message or product. At nPario, we believe that organizations stand to boost revenue by more than 10% if they harness the power of consumer intent.

Our goal is to provide our customers with a comprehensive set of data products that focus on the vast amount of commercial behavior data and generate immediate impact to their business and revenue.”

Prior to nPario, Ojjeh served as Senior Vice President of the Strategic Data Solutions division at Yahoo, where he was responsible for building data products that leveraged Yahoo data to drive audience engagement and advertising revenues, so it’s safe to say he knows what he’s talking about. Still according to the release, CTO Krishna Uppala is behind more than 15 database technology patents – he served as Senior Director/Architect at Yahoo before nPario.

Finally, nPario Chief Revenue Officer Basel Tutunji will be responsible for sales and business development for the startup. Before nPario, Tutunji held sales management roles at several multinationals including SAS, Intershop Communications and Oracle.



Mac Pricing Leak? Perhaps Updated Macs Are Inbound.

Posted: 17 Mar 2010 06:47 AM PDT

In case you haven't noticed, the MacBook Pro line is starting to get a little stale with just a lowly Core 2 Duo CPU. Even the Mac Pro with its Quad-Core Xeon isn't the fastest kid in town anymore with the six-core Core i7-980X making the rounds. Hopefully all this fuss concerning a supposed leak of new Mac pricing that's a bit higher than the current MSRPs foreshadows updates coming in the near future.


uTest Finds 908 Bugs In Web And Mobile Apps Of Major U.S. TV Networks

Posted: 17 Mar 2010 06:17 AM PDT

Software testing marketplace uTest today announced the results of its so-called “TV Networks Bug Battle” competition. More than 500 software professionals from 30 countries around the world participated in the quarterly competition, reporting a total of 908 technical, functional and GUI bugs in the web and mobile apps of NBC, CBS, Fox and ABC.

Testers were challenged to search the sites for bugs – performing a combination of exploratory, functional and usability testing. At its conclusion, participants filled out a detailed survey in which they ranked each site based on video quality, ease of use, community features and actual TV content/shows. After carefully reviewing each bug and survey response, uTest awarded roughly $4,000 in prize money based on the quality of bugs and feedback.

Top findings:

- Nearly 50% of survey respondents chose video quality as the attribute most important to them when evaluating an online TV network. NBC.com scored highest in video quality.
- Ease of use was deemed most important by 33% of the participants, followed by TV content & shows (12%) and community features (5%). CBS.com scored highest in ease of use.
- 70% of respondents watch at least one show online each week, with more than one quarter watching four or more. 7% watch seven or more programs.
- More than 10% of the total reported bugs were found on mobile devices.
- None of the TV networks support mobile video watching as they rely on Flash (tested on Blackberry, iPhone, Android and other mobile platforms).
- Cross-site scripting (XSS) vulnerabilities of varying degrees of severity were reported on three out of the four sites.

TV Network comparison (top-two box score of testers who rated each site as “excellent” or “good”):

Here’s the full report:



Advertising Expenditures Dropped 12.3% In 2009, But Digital Grew 7.3%

Posted: 17 Mar 2010 05:20 AM PDT

Total advertising expenditures fell 12.3% last year to $125.3 billion as compared to 2008, according to data released today by Kantar Media. However, Q4 2009 ad spending was off 6% against the year ago period, with nearly all media improving upon their January-September performance.

Zooming in on the digital part of the equation, Kantar Media says Internet display ad expenditures actually increased 7.3 percent in 2009, aided by higher spending from the telecom, factory auto and travel categories.

Print media were unsurprisingly hit hard, with measured ad spending in the Newspaper sector plunging by 19.7% in 2009.

You can find more figures and insights in the press release.



GetJar: Mobile App Sales Will Overtake CD Sales By 2012 (Video + Slides)

Posted: 17 Mar 2010 04:29 AM PDT

An independent study released this morning by neutral app store GetJar indicates that the market for mobile apps should grow to a whopping $17.5 billion within the next three years.

This would basically mean that the value of apps sold would be greater than the value of CDs sold in 2012 ($13.83 billion).

According to the same study, downloads of mobile apps to handsets will leap from slightly more than seven billion in 2009 to nearly 50 billion in 2012, representing a YOY growth of 92%.
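
As a quick sanity check on that growth figure, the implied compound annual growth rate can be computed from the two endpoints. This is only a sketch under stated assumptions: the inputs are rounded to 7 billion downloads in 2009 and 50 billion in 2012, and the helper name `cagr` is ours, not something from the study.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end / start) ** (1 / years) - 1


# Assumed round figures: ~7 billion downloads in 2009, ~50 billion in 2012.
rate = cagr(7e9, 50e9, 2012 - 2009)
print(f"Implied annual growth: {rate:.1%}")
```

With these rounded inputs the result lands a little under 93% per year, which is consistent with the quoted 92% once "slightly more than seven billion" and "nearly 50 billion" are taken into account.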

The figures are pretty much in line with other forecasts, such as research2guidance’s prediction that the worldwide smartphone application market will grow from $1.94 billion in 2009 to $15.65 billion by 2013.

GetJar had commissioned independent consulting firm Chetan Sharma Consulting to look into the global mobile apps market, in order to analyze the potential and real value of the mobile apps market worldwide, using first-hand data.

According to the study, by 2012, off-deck paid-for apps will be the biggest revenue generator, accounting for almost 50 per cent of all apps revenue. By comparison, in 2009, on-deck apps available from mobile operators accounted for over 60% of all apps revenue, but this will fall significantly to just under 23% by 2012.

The average app selling price for apps in North America was $1.09, significantly higher compared to that in developing markets such as South America ($0.20) and Asia ($0.10).

According to the study, revenue opportunities in Europe are set to soar from $1.5 billion in 2009 to $8.5 billion in 2012, while in North America the figure will rise from around $2.1 billion to around $6.7 billion in 2012.

Currently, apps are most popular in Asia, with the region accounting for 37% of global downloads in 2009. However, while Asia had the highest number of downloads, users in North America spent the most money on apps, accounting for over 50% of revenue.

GetJar CEO Ilja Laurs first presented the results of the study at my conference, Plugg, last week. The full video is embedded above, or you can jump straight to the Vimeo page.

As for the presentation slides:



PayPal Wants To Go From 1000 To 2000 Employees In Asia – This Year

Posted: 17 Mar 2010 02:43 AM PDT

PayPal has seen the future, and apparently it lies out East. The eBay company has just announced plans to double its presence in the Asia Pacific region by the end of 2010, and made a couple of other, separate announcements to underscore its focus on Asia.

At PayPal’s new international headquarters in Suntec City, Singapore’s technology hub in the middle of the nation’s central business district, the company said that it plans to double the number of employees in Asia Pacific from 1,000 currently to more than 2,000 by the end of the year.

The company plans to add more than 100 new jobs at its international headquarters in Singapore alone, as it represents all of the company’s business outside of the United States.

New jobs will be located at all seven offices in the region including Australia, China, Hong Kong, India, Japan, Singapore and Taiwan. For its Singapore business headquarters and development center, PayPal will be recruiting Singapore-based professionals with expertise in technology, product development, infrastructure design, risk and engineering.

PayPal says it has processed more than $6 billion of total payment volume (at spot rate) in Asia Pacific in 2009, an increase of 38 percent from 2008. Since its establishment in the region in 2006, the company has struck dozens of partnerships with Asian companies including this morning’s announcements with DBS, Singapore’s largest bank, and China UnionPay, China’s bank card association (more about the latter deal over at BusinessWeek).

As part of PayPal’s plans to help grow the e-commerce ecosystem across Asia Pacific, the company also announced that the PayPal mobile payment software development kit (SDK) will be made available to developers in the region. That way, with just a few lines of code, developers can add a checkout button to accept mobile payments without the need to collect financial information from customers.

The mobile SDK, which will initially support iPhone app development, will be available in the second quarter of 2010 to developers in the region.



AT&T Emails SF Customers That The Network Is Getting Better (Did The Call Fail?)

Posted: 16 Mar 2010 06:09 PM PDT

When bashing AT&T’s network, two cities usually come up above all others: New York City and San Francisco. AT&T has even acknowledged just how bad it is in those cities. But they’ve also said for a while that they’re working on making it better. And apparently now that work is far enough along that they’re emailing customers about it.

Over the past week, AT&T has been emailing its customers in San Francisco to let them know that the network is getting better. “We wanted you to be among the first to know! We recently enhanced the 3G network in the greater San Francisco area to provide better in-building 3G coverage, fewer dropped calls and a better overall wireless experience,” says the email.

It continues, “With better coverage on the nation’s fastest 3G network, there’s never been a better time to be an AT&T customer.” In other words, “things are getting better, please don’t leave.”

Hopefully, it’s a good sign if AT&T feels comfortable enough about its network to email its customers. Another good sign: AT&T did a great job this past week keeping its network stable during the SXSW festival (after failing badly last year). Still, AT&T has been saying for a while that they’re working on fixing the network in San Francisco, and on any given day it can be as bad as ever.

Below, find the full email being sent:

We wanted you to be among the first to know! We recently enhanced the 3G network in the greater San Francisco area to provide better in-building 3G coverage, fewer dropped calls and a better overall wireless experience.

The great news is coverage in Northern California will continue to improve as we expand capacity, optimize and add more sites in the coming months.

With better coverage on the nation’s fastest 3G network, there’s never been a better time to be an AT&T customer.

We thank you for your continued loyalty and look forward to sharing more good news soon.



Big Data Is Less About Size, And More About Freedom

Posted: 16 Mar 2010 05:33 PM PDT

Editor’s note: Big Data has been around for a long time between credit card transactions, phone call records and financial markets. Companies like AT&T, Visa, Bank of America, eBay, Google, Amazon and more have massive databases they mine for competitive advantage. But lately, Big Data is finding its way to the smallest startups. The Web and cloud computing bring Big Data everywhere. But what exactly is pushing Big Data forward?

To answer that we brought in an expert, Bradford Cross. Bradford is the Co-Founder and Head of Research at FlightCaster. FlightCaster is backed by Y Combinator, Tandem Entrepreneurs and Sherpalo Ventures. The company analyzes large data sets to predict flight delays. Bradford is chair of the Dealing with Big Data track at Cloud Connect this week.

We are in a Renaissance for computer science, engineering, and learning from data right now. The scale of data and computations is an important issue, but the data age is less about the raw size of your data, and more about the cool stuff you can do with it. Now that there is so much data, it is time to unlock its value. Really neat things are happening already – like the way the people of the world can educate themselves on all manner of issues and topics, or the way data and computing serve as leverage in other scientific and technical endeavors. There will be lots of amazing stuff on the web, but innovation will come in other domains as well.

The recent big data trend is about the democratization of large data more than its growth. In articles like the Economist’s recent piece on the data deluge, we hear about big data everywhere. We hear about what big data and the cloud mean for the enterprise, but they have had big data for a long time. eBay manages petabytes in its Teradata and Greenplum data warehouses. Sophisticated startups extracting value from big data is also nothing new – it has been happening at least since the days of Yahoo! and Google, and they have done it without the data warehousing folks.

Now focused early stage startups can get up and running faster than ever. Less technical analysts at companies like Facebook and Twitter can access massive amounts of data easily. Even individuals can undertake cool projects with big data, as Pete Skomoroch of Data Wrangling did with trending topics for Wikipedia.

Why Now?

We do not have to build all our own hardware and software infrastructure anymore.

Pioneers such as Amazon have given us the cloud, where we have the capability to run very large server clusters at a low startup cost. Pioneers like Google have paved the way for open source projects like Hadoop and HBase, that are backed by big company contributors like Facebook.

The combination has paved the way for a new class of data driven startup like Aardvark (just acquired by Google) and Factual. It has reduced both cost and time to market for these startups, as we showed with FlightCaster. And it has allowed startups that were not necessarily data driven to become more analytical as they evolved, such as Facebook, LinkedIn, Twitter, and many others.

So we have big data, the cloud, and open source facilitating new data-driven startups. I like to break this trend down from the technical perspective into three chunks: storing data, processing data, and learning from data. I define “learning from data” to mean data mining, AI, machine learning, statistics, and so on.

Supersize my data. Oh wait, I’ll just have a Medium.

The first time I heard the “Medium Data” idea was from Christophe Bisciglia and Todd Lipcon at Cloudera. I think the concept is great. Companies do not have to be at Google scale to have data issues. Scalability issues occur with less than a terabyte of data. If a company works with relational databases and SQL, they can drown in complex data transformations and calculations that do not fit naturally into sequences of set operations. In that sense, the “big data” mantra is misguided at times. For instance, a GigaOm article about big data in the cloud states:

What is becoming increasingly clear is that Big Data is the future of IT. To that end, tackling Big Data will determine the winners and losers in the next wave of cloud computing innovation.

The big issue is not that everyone will suddenly operate at petabyte scale; a lot of folks do not have that much data.

The more important topics are the specifics of the storage and processing infrastructure and what approaches best suit each problem. How much data do you have and what are you trying to do with it? Do you need to do offline batch processing of huge amounts of data to compute statistics? Do you need all your data available online to back queries from a web application or a service API?

Once your data and its processing are large enough to require distributing the data and the work among machines across network boundaries, things get a lot harder. You have to deal with distributed computing and make tradeoffs like a real computer scientist.

Big Data & The Cloud: Viral Buzzwords 4.0!

The cloud, and hosted services, present very interesting opportunities. One of the greatest is that people can leverage the a la carte economics of elastic computing to do things that were prohibitively expensive due to the requirements of building and maintaining their own hardware infrastructure. The interesting parts about the current cloud are its lack of entrance friction and elastic cost efficiency, the speed with which new entrants can set up, and the elastic capability to run 100-machine clusters for 1 hour if that is what is needed.

We started FlightCaster almost a year ago, and it is a good example of how startups can leverage cloud compute and storage resources, mix some open source like Hadoop with some data mining, and create interesting new technologies with relatively low capital upfront.

The cloud is not cheaper in general. Once people scale to a certain point, they move off the cloud onto dedicated hardware – not the other way around. That may change, and better hosted services may play a role in the transition, but that will take a while. In the meantime, the interesting part of the cloud is the use of elastic resources and the ability to get up and going quickly. The interesting part is the freedom it gives startups to try things they would never otherwise do.

Another notable thing about the cloud is the new architectures emerging as a result of economic and resource tradeoffs.

Storage of large amounts of data in the cloud is much cheaper with blobstores like Amazon S3 than it is to maintain an always-up cluster for a distributed datastore. If you do mostly offline batch processing and you do not need bulk storage to be online, then it is an attractive setup.

Storage and NoSQL

Taking another glimpse from the piece on the future of big data in the cloud:

A Big Data stack…will also need to emerge before cloud computing will be broadly embraced by the enterprise. In many ways, this cloud stack has already been implemented, albeit in primitive form, at large-scale Internet data centers, which quickly encountered the scaling limitations of traditional SQL databases as the volume of data exploded. Instead, high-performance, scalable/distributed, object-orientated data stores are being developed internally and implemented at scale…large web properties have been building their own so-called “NoSQL” databases, also known as distributed, non-relational database systems (DNRDBMS).

There are several misguided points here. First, there is not going to be a big data or cloud stack. Distributed systems are about making tradeoffs and a move toward problem-specific solutions rather than one-size-fits-all stacks. Second, enterprises already have their solution – expensive data warehousing and consulting support. Will open source projects like Hadoop supported by people like Cloudera take a chunk of the business? Sure. But as I mentioned earlier, the most interesting part about big data and the cloud is not cheaper alternatives for the enterprise, it is the opportunities it facilitates for data-driven startups.

There is a lot of talk about the NoSQL movement. The big idea here is that distributed systems are hard, require tradeoffs, and sometimes we are better off with data storage and processing that are specific to what we are doing with the data. Sometimes even with a small amount of data on a single node, there are better alternatives to SQL queries and relational databases – time series data has long been a good example.

Processing and Hadoop: The Elephant In The Room

There is a broad range of needs for processing large amounts of data. These range from simple needs like calculations for log analysis that just need to occur at scale, to middle of the road needs like BI, to complex needs like scalable modern machine learning and retrieval systems.

There are different approaches one can use to service specific needs. Again, we see the pattern of moving away from one-size-fits-all stacks, and toward building for your needs. That said, there are very generic abstractions like Map-Reduce that work well for a lot of use cases. Distributed systems are hard to get right, so when something like Hadoop gets a lot of momentum, it retains that momentum until alternatives have the time to mature enough to solve the hard problems with fault tolerance, performance, and so forth. Not everyone is Leonardo da Vinci, so people should not attempt to create these systems on their own unless they really know what they are doing. In that sense, the cloud and big data are facilitators of open source.
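
To make the Map-Reduce abstraction a bit more concrete, here is a minimal single-process sketch in plain Python. It is not Hadoop's API and skips everything that makes the real thing hard (distribution, fault tolerance, spilling to disk); the function names `map_phase`, `shuffle`, `reduce_phase` and `count_tokens`, and the sample log lines, are all made up for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (token, 1) pair for every whitespace-separated token."""
    return [(token, 1) for token in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the list of values for one key into one result."""
    return key, sum(values)


def count_tokens(lines):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical log lines standing in for a large input data set.
log_lines = ["GET /index", "GET /about", "POST /index"]
counts = count_tokens(log_lines)
print(counts)
```

The point of the abstraction is that only `map_phase` and `reduce_phase` are problem-specific; a framework can own the grouping step and spread the work across machines.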

An important aspect of processing at scale is abstraction. Writing complex or even simple computations in raw Map-Reduce is verbose for programmers and intimidating for others who might want to play with the data. Abstractions over Map-Reduce like Pig and Hive make simple things easy, and abstractions like Cascading make hard things possible. The Map-Reduce paradigm, and Hadoop in particular, have been a big success. That said, Map-Reduce is not the only important piece of compute infrastructure. Message queues serve as the backbone of a lot of compute architectures – implementations of AMQP, such as RabbitMQ, are a prime example. You can accomplish a lot with producers, consumers, and a messaging system. Distributed storage and processing systems can also be very tricky to configure and deploy, requiring a pretty deep understanding of the system – hence the business case for folks like Cloudera.
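The producer/consumer idea can be sketched in a few lines of Python. This is an in-process stand-in that uses the standard library's queue.Queue instead of an AMQP broker such as RabbitMQ, so it shows only the shape of the pattern; the function names and the squaring "work" are illustrative assumptions.

```python
import queue
import threading

def producer(q, items):
    # Producer: publish each item, then a sentinel to signal completion.
    for item in items:
        q.put(item)
    q.put(None)

def consumer(q, results):
    # Consumer: process messages until the sentinel arrives.
    while True:
        item = q.get()
        if item is None:
            break
        results.append(item * item)

def run_pipeline(items):
    # Wire one producer to one consumer through a thread-safe FIFO queue.
    q = queue.Queue()
    results = []
    worker = threading.Thread(target=consumer, args=(q, results))
    worker.start()
    producer(q, items)
    worker.join()
    return results

squares = run_pipeline([1, 2, 3])
```

With a real broker, the producer and consumer would be separate processes, possibly on separate machines, and the queue would persist messages between them.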

Learning from Big Data

Hal Varian, Google’s Chief Economist, recently said,

The sexy job in the next ten years will be statisticians… The ability to take data – to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it

Unfortunately for those of us working on these problems in real life, it is not so simple. The archetypal data-renaissance man is mathematician, statistician, computer scientist, machine learner, and engineer all rolled into one. There are opportunities where you can lack some of these skills and work with a team that supplements your weak points – a startup is not one of those.

Now that we can store so much data, it is attractive to do previously unimaginable things with it. We are sure to see cool applications in fields from the internet to biotechnology to nanotechnology and fundamental materials science research. Almost all advances in every field of science and technology are now heavily dependent upon data and computing. Machine learning is serving a fantastic role as a bridge between mathematical and statistical models and the worlds of AI, computer science, and software engineering. We are exploring applications in learning from text, social networks, data from scientific experiments, and any other data sources we can get our hands on.

The data renaissance does present some difficult issues. There are not many places one can receive a good education on working on these problems at large scale. Scaling our modeling and optimization algorithms is hard. We need to figure out how to partition and parallelize, or sometimes accept approximately correct calculations in exchange for speed and scale. Another issue is that we are often using simplistic models, albeit with pretty good results in many cases. We would like to move toward a deeper approximation of real intelligence.
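As a small illustration of both options, the sketch below computes a mean two ways: exactly, by partitioning the data and combining per-chunk partial results, and approximately, from a random sample. The data set, chunk count, and sample size are arbitrary choices for the example, and the chunks are processed in a plain loop where a real system would hand them to separate workers.

```python
import random

def partition(data, n_parts):
    # Split data into n_parts roughly equal, contiguous chunks.
    size = -(-len(data) // n_parts)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]

def partial_stats(chunk):
    # Each worker returns only a (sum, count) pair, which is cheap to ship.
    return sum(chunk), len(chunk)

def combined_mean(data, n_parts):
    # Combine the partial results; this matches the single-node mean.
    stats = [partial_stats(c) for c in partition(data, n_parts)]
    total = sum(s for s, _ in stats)
    count = sum(n for _, n in stats)
    return total / count

def sampled_mean(data, sample_size, seed=0):
    # Approximate alternative: estimate the mean from a random sample.
    rng = random.Random(seed)
    sample = rng.sample(data, sample_size)
    return sum(sample) / sample_size

data = list(range(1, 10001))
exact = combined_mean(data, 4)
approx = sampled_mean(data, 500)
```

The partitioned version touches every record but parallelizes cleanly; the sampled version touches 5% of the records and returns an estimate rather than the exact answer.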

But the data renaissance is here. Be a part of it.



Social Gaming Startup MetroGames Gets A $5 Million Infusion From Playdom

Posted: 16 Mar 2010 04:50 PM PDT

Argentinian social gaming company MetroGames has just raised $5 million in series A funding from game developer Playdom. According to a release, the investment will be used to expand MetroGames’ development of games and its social gaming platform. Playdom’s CEO, John Pleasants, will join MetroGames’ board.

MetroGames has over 30 games on both Facebook and its own standalone social gaming site. In the release, Pleasants said that he believes that the company will become a “big player in the social gaming market.”

It’s no secret that Playdom is eying the Facebook gaming market that Zynga dominates. The social gaming company just bought Facebook game developer Offbeat Creations. In November, Playdom raised a massive $43 million at a $260 million valuation. As we reported at that time, Playdom’s presence on MySpace was strong. Their Mob Wars game has 14 million or so users there, and the company was likely pulling in $60 million or more in revenue at that time. According to our stats from November, Playdom had 28 million monthly game users, with 60% of traffic coming from MySpace vs. 40% from Facebook.



Xobni’s BlackBerry App Is Just An Excuse To Sync Your Contacts Through Xobni One

Posted: 16 Mar 2010 03:48 PM PDT

It took almost a year, but Xobni finally released its email app for the BlackBerry. It works as a standalone app integrated with the email on your BlackBerry, but similar to Xobni’s Outlook plugin, it ranks your contacts by importance and pulls in social data from Facebook, LinkedIn and other places.

Along with the BlackBerry app, Xobni is introducing another product which may turn out to be more important in the long run. It is called Xobni One, and it syncs your Xobni contacts in Outlook with your contacts on your BlackBerry, all in the cloud. As Xobni rolls out more apps in the future, Xobni One should be able to sync contacts across those as well (very Mesh-like).

Xobni One is a way to sync your desktop and mobile contacts. If you use Outlook on your desktop at work, but Gmail on your BlackBerry, Xobni One reconciles the two. And when you leave your job, your contacts stay with you. Xobni One isn’t free. It costs $4 a month or $40 a year, bundled with the BlackBerry app. Keeping your contacts in sync is expensive. Doesn’t it seem that Google or Microsoft will eventually just do this for free?



CODE Advisors Absolutely, Definitely Not Working With MySpace On A Spinoff

Posted: 16 Mar 2010 01:08 PM PDT

Lots of scuttlebutt around Silicon Valley that new investment bank CODE Advisors is out pitching a MySpace spinoff to potential buyers and investors. Sources include people who’ve actually been pitched.

CODE Advisors partner Quincy Smith says “We have not been engaged by News Corp. or MySpace on a sale of the company.” MySpace also contacted us to deny the rumor – “The story is false.” – although we hadn’t actually gotten around to asking them yet. Word travels fast, it seems.

MySpace does confirm that they have an ongoing relationship with CODE Advisors to look for companies that they may want to buy, particularly in the music space (they’ve bought two music startups, iMeem and iLike, in the last year). CODE Advisors partner Fred Davis is leading that effort.

But any effort to spin off MySpace from News Corp. – something we’ve argued must be done for the company to have any chance to thrive – is being done unofficially. And perhaps without the knowledge of News Corp. execs.

Are MySpace execs testing the waters to see if there’s a way to spin themselves off of the politics-driven News Corp.? That’s being flatly denied. But it sure would make a lot of sense. And, like we said, the pitches are happening, whether everyone denies it or not.



Spotify Consumes More Internet Capacity Than All Of Sweden

Posted: 16 Mar 2010 01:04 PM PDT

Today, during his keynote address at the SXSW festival in Austin, Texas, Spotify CEO Daniel Ek had a big revelation: “On certain days, we’re consuming more Internet capacity than Sweden has as a country.”

Ek made the statement when asked why Spotify chose to use a P2P model, rather than centrally store all of its music in one place and stream it from there. Ek noted that if they were to stream from one UK datacenter, they’d consume all the bandwidth. So instead, they leverage the power of the Internet to get their users to help them stream to other users.

Ek also said this was primarily the reason that Spotify is a native application, rather than a web app. P2P streaming is a bit more complicated than streaming from one source on the backend of things, obviously.

When asked why Apple (which of course, runs the largest music store in the world, iTunes) doesn’t use the P2P method, Ek said that was the “million dollar question.” He then speculated that they will move more towards Spotify in terms of being in the cloud (something we’ve written about a few times), and having a subscription model.

Ek noted that Spotify is now in six countries and has over 320,000 paid subscribers. That’s up from 260,000 the last time they mentioned it. Overall, they have some 7 million users now. And yes, that’s largely without the U.S. where the service only exists in a very limited closed beta as the company negotiates with the labels for music rights.



Google Automates The Creation Of YouTube Overlay Ads

Posted: 16 Mar 2010 12:36 PM PDT

In its relentless push to turn YouTube into a profit center, Google is trying anything it can to pump more advertising into the billions of videos people watch on the site. Now it is automating the way that Flash overlay ads can be created and displayed on YouTube videos. Through the self-serve Display Ad Builder in Google AdWords, mom-and-pop businesses can now create Flash overlay ads as easily as they can create display banner ads and place them in YouTube videos.

Overlay ads have been around for a long time on YouTube and other video networks. YouTube constantly refines the types of overlay ads it shows, but many of the small businesses which typically advertise on Google AdWords don’t have the tools to create Flash overlay ads. Now Google is providing them with templates, much like it does already for banner ads.

As of last October, YouTube was showing ads on more than one billion videos a week, which was roughly one in seven videos. YouTube wants to open up all of its video inventory to advertisers large and small. Today’s release is the latest move in that direction.

At what point will there be too many ads and will consumers ever backlash? Already I find those persistent pop-ups and overlays to get in the way of the videos I am trying to watch, and I don’t find them particularly relevant. Flooding YouTube with even more of these ads may be good for its bottom line, but viewers are not going to like them.



Live Blog: Spotify CEO Daniel Ek Says Music Service Now Has 320,000 Paid Subscribers

Posted: 16 Mar 2010 12:05 PM PDT

I’m here at the last keynote of SXSW, where Spotify CEO Daniel Ek is being interviewed by Wired’s Eliot Van Buskirk. Ek will likely be revealing some new announcements about Spotify during this interview. I’ll be live blogging my notes below.

Van Buskirk kicked off the keynote by asking how many people in the audience had used Spotify, leading a significant portion of the audience to raise their hands. This was surprising, because Spotify is only widely available in Europe (you need a beta invite to use it in the US). Ek then took some time to walk the audience through the streaming music service if they haven’t used it before (see our extensive past coverage if you need a refresher).

Q: What drove the initial decision to make this an application as opposed to something in the browser?
A: There are a few things that applications are better for. In our case, we think that applications are better for swift music playback. What we see is that people tend to spend a lot of time on Spotify because it’s so swift. They tend to replace their media player with Spotify, because they notice no difference from playing a song locally (some have even remarked that it’s faster than playing it through iTunes).

Q: Let’s talk about the licensing realities. Spotify is available in Europe. How will the model work in America?
A: There could be slight changes. A year and a half since launch more than 7 million users, only in six countries. What we’re working on is the next gen of Spotify. We’ll never be content to just have an app. There are a lot of things we want to fix in Spotify. We tend not to take the ‘release early, often’ approach. What we’ve been working on for last 6-8 months is next gen of Spotify. How to make it more connected. Easier sharing and management of music. We’ve realized people spend a lot of time on Spotify and they tend to manage their music with Spotify.

Q: Which platforms/devices are most exciting?
A: Three years ago if you wanted to develop for mobile, had to support 3-5 major mobile OSes. Long lead times. That shut out all this innovation. More recently, application devs can get the application on phones. We look a lot at bundling with devices. Mostly not for revenue possibility but more for pre-installs. With exception of the iPhone today, most of the other handset manufacturers lack a good media player. Historically hard to get music to other phones if you had in iTunes.

Q: Let’s talk about the business side of bundling. If someone is paying for cell phone bill, they can check off something to get Spotify, seems like easier decision. How has that been going in Europe?
A: We have two mobile operators working with us, many more to come. If you go into any Telia store in Sweden, you can go in and pick out a smart phone that comes preinstalled with Spotify. 3-6 months included. Incredible takeup with that. One of the key things Spotify is pushing is that people listen/share to more music than ever, more diverse artists. People will still buy music they love, but vast majority of music they just want access.

Q: We’ve heard services like Spotify people say “oh no we’re not going to buy music any more”. The idea of getting people to pay a monthly fee, that seems promising. Why would someone buy something?
A: I think we’re going that route. But we find that music I really love, I tend to want to buy it. Not necessarily a plastic disk, but a special edition for an artist I really like, I’m more than happy to pay $100 for a box set with a t-shirt in it, liner notes. Another person may be willing to pay for a live edition with extended tracks. Or pay for a live concert experience. The reality of the music industry today is that there isn’t one biz model. It’s about figuring out how to use downloads, streaming, promotion, ticketing, all these things. I don’t think streaming music is stream.. with Spotify people label us ‘free’ music. But people pay, either with time (adverts, which are targeting), or actually paying for the service.

Q: Are you going to start filtering ads by mood (e.g. if you listen to down tempo music).
A: We want to figure out a lot of things based on how people listen to music. Can figure out mood, brand preferences. We see that from CTRs, if you listen to same music and are from the same place who tends to like a certain brand, there’s a high likelihood you will too. Ad model is getting better every month. But this for me is not about free vs paid music, it’s about a model where there’s a free music element and a paid one.

A: Tech savviness at labels is increasing, now more people that love music and know the digital space are working with labels and artists.

Q: How do indy artists get music on Spotify? On iTunes you can submit paperwork. You’re different in that approach.
A: The way to get on Spotify today is we have a bunch of aggregators we work with. Main reason we’ve wanted to work with aggregators is that they tend to understand format/structure. We get quality control, picture, bio, etc.

Q: Are we done with DRM?
A: If you look at Spotify, it has DRM associated with it. We want to make it so that there isn’t really any announcement what’s DRM or not, we can protect and give users flexibility you want.

Q: Let’s talk about Spotify of the future. How do we get to point of ‘music like water’.
A: I see that’s sort of where we’re heading. The music industry needs that to happen. I think music and tech are aligned for the first time. We’ve had a lot of proprietary standards, trying to figure out how to get music on a BlackBerry phone vs. getting it on iPhone vs set top box, radically different. We need to open platforms.

Q: With regard to Twitter/FB. Are you thinking of integrating sharing functionality into Spotify?
A: We’re looking at integrating some social aspects. I think genres are non-sane. What classifies rock, or neo-pop, etc. Spotify is quickly approaching 10 mil tracks. How do you manage that? Search is one solution, but isn’t optimal way of discovering new content. We won’t be another social network. We never believed in being our own social network, we’re working with existing social networks.

Q: With your playlists people have read/write access, can delete entire thing, what are you doing about that?
A: Looking from tech angle. We support version updates. One way to solve that is that you can step back in history and go back. What we don’t have is user privilege on playlists. We think Twitter/FB will figure out those privileges, and will use them.

A: I think the total rev matters more than actual conversion rate. But we do want to make sure there are a number people are paying for Spotify and that will grow. We’re making a lot of progress. We’re in six countries, now well in excess of 320,000 paid subscribers. Last time we mentioned a fig. it was 260,000. 100 million playlists. 7 million users. People spend a lot of time on playlists. 30% of all playlists are albums (albums stored in collection). People say album is dead. I don’t agree. I think there’s a lot to develop there.

Q: Let’s talk about P2P element.
A: It was a key decision, and one reason we’re a native app. Helps offload bandwidth. P2P actually helps Spotify and users, it will take tracks on your friends and coworkers on same local network and stream to them so it’s faster. “We’re consuming more capacity than Sweden has as a country”. If we had to stream all the data from our UK center, we’d consume all the bandwidth.

Q: Why isn’t Apple doing this?
A: That’s a million dollar question. I think they are. I’m just speculating on this. Apple is very interested, we’ve had iTunes store. They’re understanding this is more to subscription model. They understand it’s going more to a cloud based model. I don’t have any magical insight into Apple.

Q: Let’s look at Spotify on this phone. I wanted to show this cool device. Sony Ericsson X10 mini. Out in US in next couple of months. It’s an Android phone. We’ve installed Spotify. Now demonstrating the app. Has a Spotify widget.

A: Over the next couple of weeks a lot of features coming in to Spotify. I hope from them moving in a more steady direction. We are listening to what users are asking us to do.

Q: US Launch? Also China?
A: The most important thing for us when it comes to US launch is that we want to build the best possible product we can and get all ducks in a row, partnerships with next gen of Spotify. Sort out publishing which is a huge task. Here you have to strike deals with almost 5000 publishers. Big thing for us is working on next gen of Spotify and getting that out there.

Q: How many plays equals one dollar?
A: Depends on the type of contract with the publisher/record labels. We share the rev we bring in. You can’t really equate to ‘per play’ we look at all our ad rev. Creates a bucket. For instance how do you account for a purchase of a song. There is no easy answer to your question. Over time our ad revs are growing, number of downloads growing. Amount of rev we bring in is growing.

Q: How is it working to convince American labels that not everyone needs to be a subscriber for it to work?
A: This is the world’s biggest music market. We have potential reach of 170 mil people in Europe. America has much more. People spend more money in America. The whole industry is looking more and more about new opportunities. At the same time CD sales have been in decline, nothing online has been able to counter balance that decline. I think people are looking at how we can support Spotify, how do we ensure that people don’t stop buying CDs.



Now Nexus One Owners Can Bitch About AT&T Too (And This Won’t Help Sales)

Posted: 16 Mar 2010 10:47 AM PDT

There’s a lot of talk today about how the Nexus One’s initial roll-out has been a flop. And while the numbers aren’t official, things do look pretty grim for the first Android device Google is attempting to sell itself. But Google is wasting no time answering its critics – indirectly – with the launch of a version of the device that will work on AT&T’s 3G network.

To be clear, this isn’t Google teaming up with AT&T on the device. Instead, it’s simply a second version of the Nexus One that works with AT&T’s 3G frequency, which is different from T-Mobile’s (the current Nexus One U.S. carrier). The original Nexus One does actually already work on AT&T, but only for 2G connections, so this new version will obviously be significantly faster.

With the new 3G frequency, the new Nexus One will also work in Canada with Rogers Wireless. And, as Google notes, “And like the first version of the Nexus One, it can be used with most GSM operators globally.”

Certainly, giving consumers more choices is always a good thing, but it seems that Google’s attempt to sell the phone itself is really the problem here. While it makes sense that phones, like most other goods (digital cameras, for example), should be an easy sell online, there’s also some thought that the Nexus One isn’t selling well because customers are so used to walking into a store and playing with a phone for a bit before buying it. If that’s the case, the AT&T addition isn’t likely to help sales.

The right play here would be for Google to offer shoppers a full list of plan options for both T-Mobile and AT&T and let them decide which carrier to pick. Unfortunately, that won’t be happening here, because again, this new Nexus One is only being sold as an unlocked phone that can work on AT&T if you get a SIM card on your own (something which most consumers will never do in the U.S.).

Eventually, if Google can offer that list of options from all the carriers (including the CDMA ones like Verizon, which, yes, will require another version of the Nexus One), that could be enough to drive customers online to buy the phone (and has always been the Nexus One’s promise, in my opinion). This move today won’t be. Also, with all the bitching about AT&T’s network by iPhone owners (though, again, it has been great at SXSW), why on Earth would anyone want to buy a smartphone to use on the network unless they absolutely had to (as they do with the iPhone)?

[photo: flickr/katybate]



Chatroulette Is 89 Percent Male, 47 Percent American, And 13 Percent Perverts

Posted: 16 Mar 2010 10:36 AM PDT

This is a guest post by Robert J. Moore, the CEO and co-founder of RJMetrics, an on-demand database analytics and business intelligence startup. His last guest post was an analysis of Twitter user data.

It’s no surprise that Chatroulette is the latest media darling. It has all the elements of a good story: technology, mystery, celebrity, and sex. If you haven’t heard of Chatroulette, this Daily Show segment is a good primer.

We were itching to study Chatroulette in a RJMetrics Dashboard, but no one seemed to have any good data for us to explore. So, we decided to compile the data ourselves by leveraging Chatroulette Map, some scrappy programming, and a passionate tech community. We soon had detailed data on 2,883 Chatroulette sessions that tied users to geography, gender, appearance, and more.

Here are a few highlights from our findings:

  • About half of all Chatroulette spins connect you with someone from the USA. The next most likely country is France at 15%.
  • Of the spins showing a single person, 89% were male and 11% were female.
  • You are more likely to encounter a webcam featuring no person at all than one featuring a solo female.
  • 8% of spins showed multiple people behind the camera. 1 in 3 females appear as part of such a group. That number is 1 in 12 for males.
  • 1 in 8 spins yield something R-rated (or worse)
  • You are twice as likely to encounter a sign requesting female nudity than you are to encounter actual female nudity

How We Did It

Thanks to RJMetrics, the analysis was easy. Getting the data, however, was a bit of a challenge. The good news is that a roulette wheel is the statistician’s best friend. The central limit theorem tells us that a large set of random observations allows us to draw high-confidence conclusions about the underlying data set.
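To give a rough sense of what "high-confidence" means at this sample size, here is a minimal Python sketch of the standard normal-approximation margin of error for a proportion. It is not part of the original analysis. It assumes the 2,883 sessions are independent random draws, and it uses the 13% R-rated rate reported in this post as the example proportion.

```python
import math

def margin_of_error(p, n, z=1.96):
    # Half-width of a normal-approximation confidence interval for a
    # proportion p observed in n independent random samples.
    # z = 1.96 corresponds to a 95% confidence level.
    return z * math.sqrt(p * (1 - p) / n)

# 2,883 is the number of sessions reported in the post; 0.13 is the
# overall R-rated rate it reports.
moe = margin_of_error(0.13, 2883)
```

Under those assumptions the interval is roughly plus or minus 1.2 percentage points. Per-country or per-group breakdowns rest on fewer sessions, so their margins would be wider.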

We started our process at Chatroulette Map, an awesome new site that plots screenshots from random Chatroulette sessions on a map.

Chatroulette Map ties Chatters to Locations

It’s a little-known fact that anyone you chat with on Chatroulette can determine your IP address using a program like Wireshark. Chatroulette Map uses this IP data to geolocate and map random chatters on their website (along with still photos from their chats).

Chatroulette Map is also nice enough to expose all of its data points to anyone who clicks “View Source.” Right in the raw source code of their homepage is the image URL, latitude, longitude, city, state, and country of every chatter on their map. As an added bonus, the file name of each image is a UNIX timestamp of when it was taken. Jackpot. (Note: we tried contacting the creators of Chatroulette Map to participate in this story but did not receive a response.)
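Recovering the capture time from a file name like that takes only a few lines of Python. The URL below is hypothetical, since the site's real paths and file extension are not given here; the sketch assumes only that the file name stem is a count of seconds since the UNIX epoch.

```python
import os
from datetime import datetime, timezone
from urllib.parse import urlparse

def capture_time(image_url):
    # Take the file name from the URL path, drop the extension, and
    # interpret the remainder as seconds since the UNIX epoch (UTC).
    name = os.path.basename(urlparse(image_url).path)
    stem, _ext = os.path.splitext(name)
    return datetime.fromtimestamp(int(stem), tz=timezone.utc)

# Hypothetical URL for illustration only.
taken = capture_time("http://example.com/shots/1268500000.jpg")
```

Passing tz=timezone.utc keeps the result independent of the local time zone of whatever machine runs the script.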

Once we had photos, times, and locations, we needed data on what was happening in each chat photo. We coded up a quick webpage that displayed a random photo from the data set and asked some basic multiple-choice questions about that photo. These included questions on age, gender, and what the person in the photo was doing. We coded up the backend so that a photo wouldn’t be taken out of rotation until two votes from different IP addresses provided an identical set of answers.
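The convergence rule can be sketched as follows. This is an illustration of the rule as described, not the actual backend: the function name, the data shapes, and the sample answers are all assumptions.

```python
def is_corroborated(votes):
    # votes: list of (ip_address, answers) pairs, where answers is a tuple
    # of the multiple-choice selections for one photo.
    # Returns True once two different IPs have given identical answers.
    ips_by_answers = {}
    for ip, answers in votes:
        ips = ips_by_answers.setdefault(answers, set())
        ips.add(ip)
        if len(ips) >= 2:
            return True
    return False

votes = [
    ("10.0.0.1", ("male", "young adult", "chatting")),
    ("10.0.0.1", ("male", "young adult", "chatting")),  # same IP again
    ("10.0.0.2", ("male", "teen", "chatting")),         # disagrees
    ("10.0.0.3", ("male", "young adult", "chatting")),  # corroborates
]
```

Here the photo stays in rotation after the first three votes, because the only matching pair comes from a single IP, and is retired on the fourth.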

We posted the link to Hacker News on Saturday night. In under two hours, we received 10,770 photo assessments from 1,012 distinct IP addresses. Every photo received a corroborated profile. We had our data.

Five minutes later, the data was loaded into a hosted dashboard on RJMetrics and returning the results you see below.

Caveats

Before we get to the data, we should point out the uncontrolled inputs that could be skewing these results:

  • We know nothing about how Chatroulette matches up chatters, and we act on the assumption that pairings are truly random.
  • We know nothing about the methodology used by Chatroulette Map. If they excluded data points for any reason or did not sample randomly, our analysis could be skewed.
  • Geolocation by IP address is an imperfect science that is typically only accurate within a few dozen miles. It can also be thrown off by users taking advantage of proxy servers or using other techniques to disguise their IP addresses.
  • Human image recognition is imperfect (even if mitigated by our vote convergence system). Any images that were judged incorrectly could skew the results.
  • It’s also important to note that statistics about “the average chat session” (which we present here) are not the same as stats about “the average user.” For example, imagine if female chats averaged 100 seconds each, but male chats averaged 10 seconds each. Even if there were equal numbers of male and female users, males would enter the pool more often and would therefore appear in front of you more often, making the “average session” more likely to contain a male chat partner. Because of this, all of our statistics are about the average session and not the average user.
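The session-versus-user example in the last caveat can be worked through with a simple steady-state model, sketched below in Python. The model is an assumption made for illustration: each group starts new sessions at a rate proportional to its number of users divided by its average chat length, and the user counts are made up.

```python
def session_share(users, avg_seconds):
    # users: number of users per group; avg_seconds: average chat length.
    # A user who chats for less time re-enters the pool more often, so the
    # rate of new sessions per group is proportional to users / duration.
    rates = {g: users[g] / avg_seconds[g] for g in users}
    total = sum(rates.values())
    return {g: rate / total for g, rate in rates.items()}

shares = session_share(
    users={"female": 1000, "male": 1000},        # hypothetical, equal
    avg_seconds={"female": 100, "male": 10},     # figures from the caveat
)
```

With equal user counts, about 91% of sessions in this model have a male on the other end, even though half of the users are female.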

The Results

Gender

As you might expect, you’re most likely to encounter a solo male in any given chat session. 72% of our chat sessions were with solo males. Interestingly, 11% showed no person at all while only 9% showed a solo female. So, if you’re looking for women on Chatroulette, be forewarned: you’re more likely to encounter an empty chair.

Most Chat Partners were Male

Also interesting is the prevalence of groups on Chatroulette. In all, 8% of chats featured a group of people (4% all-male, 2% all-female, and 2% mixed). If you include groups, your chance of encountering a female grows to 13%. However, this means that if you do encounter a female, there is about a 1 in 3 chance that she will be part of a group. In contrast, the chance a male will be part of a group is only about 1 in 12.
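The arithmetic behind those figures can be checked in a few lines of Python, using only the rounded percentages quoted above.

```python
# Shares of all chat sessions, in percent, as reported in the post.
solo_male, solo_female = 72, 9
group_male, group_female, group_mixed = 4, 2, 2

# Sessions that include at least one female or at least one male.
# Mixed groups count toward both.
any_female = solo_female + group_female + group_mixed
any_male = solo_male + group_male + group_mixed

# Chance that a female (or male) you encounter is part of a group.
female_in_group = (group_female + group_mixed) / any_female
male_in_group = (group_male + group_mixed) / any_male

# Gender split among sessions showing a single person.
solo_male_share = solo_male / (solo_male + solo_female)
```

The rounded inputs give 13% of sessions with a female, 4 in 13 (about 1 in 3) of those in a group, and an 89% male share among solo sessions. For males they give 6 in 78, or about 1 in 13, which is close to the 1 in 12 quoted above; the difference presumably comes from rounding in the published percentages.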

Age

This analysis excludes cams where age could not be estimated. As you might expect, most people were young adults (about 70%). About 20% were under 20 and about 10% were 40 and older.

Most chat partners are young adults

When we combine age with the gender statistics that we tracked above, we learn even more. For example, females tended to be younger than males, with 23% under 20 (vs. 18% for males). Only 3% of females were over 40 (vs. 8% for males).

Groups of females were even younger. Female-only groups were “Teen or Younger” 65% of the time, while groups of males were “Teen or Younger” only 36% of the time. There were no groups whatsoever of people 40 or older.

Location

47% of the Chatroulette participants measured were from the United States. The most popular countries are shown below:

Most chatters are from the United States

When we combine geography with gender and age, we learn even more:

  • Italy had the highest concentration of solo males at 98%. It also had the highest concentration of “Men over 40” at 13% (more than 3x the US rate of 4%).
  • The US had the highest concentration of groups at 13%, followed by The Netherlands at 9%.
  • Canada had the highest concentration of solo females at 13%, followed by the US at 10%.

Perverts

If you’ve ever used Chatroulette, you probably noticed that not everyone is there just to chat. Some users, which we have affectionately labeled “perverts,” fit into any of these three categories:

  • Appear to not be wearing any clothes whatsoever
  • Are displaying explicit nudity
  • Appear to be committing a lewd act

The overall pervert rate in Chatroulette is 13%. This means about 1 in 8 chat sessions will have something decidedly Rated R (or NC-17) on the other end. Of the perverts that were identified, only 8% were female. Combined with the overall female rate, that means less than 1% of chats feature a female pervert.

Below, we see the “pervert rate” by country:

Chatroulette pervert concentration is the highest in the UK

The United Kingdom dominates the rankings here with a pervert concentration of 22%! Turkey, France, and Germany tie for second place with rates of 15%. Bringing down the global average is the United States, which boasts the lowest pervert concentration of the bunch: 10%.

Also worth mentioning are the users who display signs (like the one below) requesting female nudity.

Signs like this make up between 1% and 2% of all chats. This means that you’re twice as likely to encounter a sign requesting female nudity than you are to encounter actual female nudity.

Validation

In trolling through the thousands of photos collected by Chatroulette Map, I came across this extremely interesting image. It contains a statistical breakdown of what this user saw during his many Chatroulette chat sessions. Sound familiar?

These stats appear to be based on a data set of 1,090 points (pretty impressive for a single user). The numbers are generally in the same ballpark as ours (although we observed a higher pervert rate). We’re not sure who was behind this, but we like their style – they managed to sum up the gist of this blog post in a single image.

Conclusion

Scarcity of the data made this project both challenging and exciting. In an ideal world, it would be great to analyze things like average session length based on different attributes, chat user return rates, cohort analysis, and more. Because of the mostly-anonymous nature of Chatroulette, that data will be hard to come by. For now, at least you have a better idea of what you will see when you hit that Next button.

Guest author Robert J. Moore is the CEO of RJ Metrics, a startup that helps online businesses measure, manage, and monetize better. He was previously a venture capital analyst and currently serves as an advisor to several New York startups. Robert blogs at The Metric System and can be followed on Twitter at @RJMetrics.



Team Europe Makes Early Investment

Posted: 16 Mar 2010 09:26 AM PDT

Team Europe Ventures, the Berlin-based VC firm focused on early stage Internet companies, has made a minority investment in Infakt. The Polish startup provides web-based accounting and invoicing solutions for small companies locally.

Alongside Team Europe Ventures, angel Christoph Janz also brings new investment, with the combined funding amounting to a 30% stake in Infakt. Polish business angel Krzysztof Nowinski (formerly with the VC firm BMP) is an existing investor.



Hey Twitter, are you going to deal with these Nazis or not?

Posted: 16 Mar 2010 09:23 AM PDT

There is a tweet being retweeted heavily within the German Twitter community right now which roughly translates as

“BEWARE Nazi-pigs on Twitter! @Heil_Hitler_88 Please block so that the account gets deleted. #nazi #block #rt Please!” (original).

Now, if Twitter had servers in Germany, an account like @Heil_Hitler_88 (we’re not linking BTW) would be illegal and would be deleted right away.




--- End Message ---

