Researchers Turn Data into Dynamic Demographics
Aside from showing off their travel, culinary and nightlife habits, users of the geolocated “check-in” service Foursquare could shed light on the character of a particular city and its neighborhoods.
Researchers at Carnegie Mellon University’s School of Computer Science say that instead of relying on static census and neighborhood zoning data to take the temperature of a given community, Foursquare check-in data can provide the much-needed layer of dynamic city life.
The researchers have developed an algorithm that takes the check-ins generated when Foursquare members visit participating businesses or venues, and clusters them based on a combination of the location of the venues and the groups of people who most often visit them. This information is then mapped to reveal a city’s Livehoods, a term coined by the SCS researchers.
All of the Livehoods analysis is based on Foursquare check-ins that users have shared publicly via social networks such as Twitter. This dataset of 18 million check-ins includes user ID, time, latitude and longitude, and the name and category of the venue for each check-in.
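To make the idea concrete, here is a minimal sketch of clustering venues by both location and shared visitors. It is not the researchers' algorithm, which the article does not detail. As a stand-in, it links two venues when they are geographically close and have similar visitor profiles (cosine similarity of per-user check-in counts), then reports the connected groups. The `CheckIn` record mirrors the fields listed above; the function names, the sample data and the `max_km` and `min_similarity` thresholds are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class CheckIn:
    # Fields follow the dataset description; names are hypothetical.
    user_id: str
    time: str
    lat: float
    lon: float
    venue: str
    category: str


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def cosine(a, b):
    """Cosine similarity between two per-user check-in count vectors."""
    dot = sum(a[u] * b[u] for u in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster_venues(checkins, max_km=1.0, min_similarity=0.3):
    """Group venues that are both nearby and visited by similar people.

    Assumed simplification: threshold the links and take connected
    components, rather than whatever clustering the researchers used.
    """
    visitors = defaultdict(Counter)  # venue -> {user_id: check-in count}
    location = {}                    # venue -> (lat, lon)
    for c in checkins:
        visitors[c.venue][c.user_id] += 1
        location[c.venue] = (c.lat, c.lon)

    venues = sorted(visitors)
    parent = {v: v for v in venues}  # union-find forest

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for i, a in enumerate(venues):
        for b in venues[i + 1:]:
            near = haversine_km(*location[a], *location[b]) <= max_km
            if near and cosine(visitors[a], visitors[b]) >= min_similarity:
                parent[find(a)] = find(b)

    groups = defaultdict(set)
    for v in venues:
        groups[find(v)].add(v)
    return sorted(sorted(g) for g in groups.values())


# Made-up check-ins: two nearby venues share visitors, a third nearby venue
# draws a different crowd, and a fourth shares visitors but is far away.
sample = [
    CheckIn("u1", "2012-01-01T19:00", 40.4500, -79.9300, "Cafe A", "Coffee Shop"),
    CheckIn("u2", "2012-01-01T19:30", 40.4500, -79.9300, "Cafe A", "Coffee Shop"),
    CheckIn("u1", "2012-01-02T21:00", 40.4510, -79.9310, "Bar B", "Bar"),
    CheckIn("u2", "2012-01-02T21:15", 40.4510, -79.9310, "Bar B", "Bar"),
    CheckIn("u3", "2012-01-03T22:00", 40.4520, -79.9320, "Club C", "Nightclub"),
    CheckIn("u4", "2012-01-03T22:10", 40.4520, -79.9320, "Club C", "Nightclub"),
    CheckIn("u1", "2012-01-04T12:00", 40.6000, -79.9300, "Diner D", "Diner"),
    CheckIn("u2", "2012-01-04T12:05", 40.6000, -79.9300, "Diner D", "Diner"),
]

print(cluster_venues(sample))
# [['Bar B', 'Cafe A'], ['Club C'], ['Diner D']]
```

In this toy data, Club C sits next to the first two venues but stays separate because nobody visits both, while Diner D shares its visitors with them but is too far away to join. That is the sense in which a grouping built this way can cut across, or subdivide, the neighborhoods drawn on an official map.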
“Our goal is to understand how cities work through the lens of social media,” said Justin Cranshaw, a Ph.D. student in SCS’s Institute for Software Research.
The researchers analyzed data from Foursquare, but the same computational techniques could be applied to several other databases of location information. The researchers are exploring applications to city planning, transportation and real estate development. Livehoods also could be useful for businesses developing marketing campaigns or for public health officials tracking the spread of disease.
For now, however, it’s being used to get a grip on the cultural and even class distinctions present in a community. For instance, in their study of Carnegie Mellon’s home in Pittsburgh, the researchers found that the Livehoods they identified sometimes spilled over existing neighborhood boundaries, or identified several communities within a neighborhood. The Pittsburgh analysis was based on 42,787 check-ins by 3,840 users at 5,349 venues.
In one case, the researchers found that the upscale neighborhood of Shadyside actually had two demographically distinct Livehoods: an older, staid community to the west and a younger, “indie” community to the east. Moreover, the younger Livehood spilled over into East Liberty, a neighborhood that long suffered from decay but recently has seen some upscale development.
And how does this match up with the class and cultural viewpoints of a human observer? Quite well. “That makes sense to me,” observed a 24-year-old resident of eastern Shadyside, one of 27 Pittsburgh residents who were interviewed by researchers to validate the findings. “I think at one point it was more walled off and this was poor (East Liberty) and this was wealthy (Shadyside) and now there are nice places in East Liberty and there’s some more diversity in this area (eastern Shadyside).”
Speaking of class divides, the limitations of the research are themselves a viable point of study. Foursquare users tend to be young, urban professionals with smartphones. Consequently, areas of cities with older, poorer populations are nearly blank in the Livehoods maps. That blankness is itself an indication of class makeup, something potentially valuable when seeking new dwellings or pricing real estate, for instance.
Maps for New York, San Francisco and Pittsburgh are available on the project website, http://livehoods.org/. The team has added voting for the next city to be “checked.”