Real Massive User Data Data Science Stories


LinkedIn leverages valuable data from real user identities to build innovative social products

LinkedIn has a bold vision for the future of big data and the web. During recent appearances at the 2011 South by Southwest festival and the Web 2.0 Expo, LinkedIn founder Reid Hoffman declared that Web 3.0 will be about data and the analytics drawn from it. With the onset of mobile, the amount of both active and passive data each online user generates is increasing. As data volume grows, so will our ability to pull actionable information and real insight from the data. This will transform how the web is used, as the real identities users established online during the Web 2.0 era begin generating increasingly massive amounts of data.

With over 100 million users, LinkedIn is in a prime position to access user data and leverage its insights. The company’s commitment to finding innovative and useful applications for its massive stores of user data can be seen in products such as LinkedIn Skills: hyperlinked lists of skills that offer insight into how relevant particular skills are to a prospective employer and which other skills are closely associated with them. Users can find people, companies, and jobs related to their skills. The information can then be mashed up with Wikipedia to gain greater insight and context. The master list of skills was built from the ground up, based on how users defined their own skill sets, which required sophisticated analytics of the massive amounts of data generated by the users.

The result is a flexible tool that reflects how users choose to self-identify in the real professional world, rather than one that imposes such descriptions and definitions from the top down. Other LinkedIn big data projects demonstrate the robust stores of user data at the company’s disposal. One example is InMaps, an innovative data visualization that maps the connections in a user’s entire professional universe.

“This is just the start of the big data revolution. Every organization should be thinking about and leveraging data at all times.” Peter Skomoroch, Principal Data Scientist at LinkedIn

This devotion to data is nothing new for LinkedIn; in fact, it has driven the company since its inception in 2002. Nine years later, LinkedIn is adding one million new users per week, and the company recently had a wildly successful IPO. LinkedIn is driven to constantly iterate on the product, leveraging the highly valuable data generated by professionals who have established real identities on the service.

By elevating the role of the data scientist, LinkedIn has consistently provided its user base with innovative ways to network and build their resumes and professional reputations. This approach has guaranteed LinkedIn a competitive advantage, allowing the company to offer users professional tools before any other company in the social space.

The company's commitment to data-driven insight and product development is evidenced by its diverse data scientist group, which includes scientists from various backgrounds as well as people from a visualization background. This diverse group offers the company a unique competitive edge. “We combine [the data scientist group's] unique perspectives along with a focus on evidence and data and applying rigor, along with hacking and engineering chops, and mix that together and infuse it within different products in the organization, and you get some great results,” says Peter Skomoroch, Principal Data Scientist at LinkedIn.

LinkedIn sees this approach not only as the key to the next transformation of the web, but also as an example of how companies should operate going forward. “This is just the start of the big data revolution,” Skomoroch says. “The precedent we hope to set is that every organization should be thinking about and leveraging data at all times--how you're using that data to drive decisions. When you have to make a business decision, you should back it up with data. The data that is flowing through your systems can be used to help your users.”
