Twitter Ranking Tweets With Machine Learning
Machine learning is entering production at Twitter as a way of ranking tweets and boosting engagement.
Twitter engineers this week unveiled the social media platform’s ranking algorithm driven by deep neural networks. In a blog post, company engineers said their approach leverages an in-house artificial intelligence platform that includes new modeling capabilities.
Among the results, wrote Nicolas Koumchatzky, a software engineer with Twitter’s AI team called Cortex, are “more relevant timelines now, and in the future, as this opens the door for us to use more of the many novelties that the deep learning community has to offer, especially in the areas of [natural language processing], conversation understanding and media domains.”
Currently, Twitter (NYSE: TWTR) timelines are arranged chronologically based on a user’s last visit. That alone is a daunting infrastructure task, the company notes. The new ranking algorithm gathers all tweets from accounts being followed by an individual user. It then scores those tweets using a relevance model, with the goal of predicting the most relevant comments. The highest-ranked tweets are then displayed at the top of a user’s timeline, Koumchatzky explained.
Twitter’s new ranking model takes into account factors such as the number of re-tweets, likes, the inclusion of images and video and other responses to a post. It also attempts to gauge a user’s past interactions with authors along with the “strength” and origin of the follower’s relationship to an author.
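The gather, score and rank steps described above can be illustrated with a small sketch. This is not Twitter’s model, which the company says is built on deep neural networks; a simple logistic scorer stands in for it here. The `Tweet` fields, the `WEIGHTS` values, the `BIAS` term and the function names are all hypothetical, chosen only to mirror the factors named in the article.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Tweet:
    # All fields are illustrative stand-ins for factors named in the article.
    tweet_id: str
    retweets: int
    likes: int
    has_media: bool
    author_affinity: float  # hypothetical 0..1 measure of past interaction with the author


# Hypothetical hand-picked weights; a production model would learn these from data.
WEIGHTS = {"retweets": 0.4, "likes": 0.3, "has_media": 0.5, "author_affinity": 2.0}
BIAS = -2.0


def relevance_score(tweet: Tweet) -> float:
    """Return a score in (0, 1) from a logistic model over the tweet's features."""
    z = (
        BIAS
        + WEIGHTS["retweets"] * math.log1p(tweet.retweets)  # log1p damps large counts
        + WEIGHTS["likes"] * math.log1p(tweet.likes)
        + WEIGHTS["has_media"] * float(tweet.has_media)
        + WEIGHTS["author_affinity"] * tweet.author_affinity
    )
    return 1.0 / (1.0 + math.exp(-z))


def rank_timeline(candidates: List[Tweet], top_k: int = 2) -> List[Tweet]:
    """Score every candidate tweet and return the top_k highest-scoring ones."""
    return sorted(candidates, key=relevance_score, reverse=True)[:top_k]


# Candidate tweets gathered from the accounts a user follows (made-up data).
candidates = [
    Tweet("a", retweets=0, likes=1, has_media=False, author_affinity=0.1),
    Tweet("b", retweets=50, likes=200, has_media=True, author_affinity=0.2),
    Tweet("c", retweets=2, likes=5, has_media=False, author_affinity=0.9),
]
ranked = rank_timeline(candidates)
print([t.tweet_id for t in ranked])  # ['b', 'c']
```

In this toy data, the heavily engaged tweet with media ranks first, and a lightly engaged tweet from an author the user often interacts with still outranks one from a weaker relationship.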
“This scoring step imposes an even greater computational demand on timelines serving infrastructure, as we are now scoring thousands of tweets per second to satisfy all of the timeline requests,” the Twitter engineer noted. “The unique challenge is to perform scoring quickly enough to instantly serve tweets back to the people viewing their timelines, yet have powerful enough models to allow for the best possible quality and future improvements.”
Along with prediction models, Twitter said similar requirements are applied to its machine learning frameworks. A set of tools is used to train and launch a prediction model, with particular attention given to training speed and scaling when dealing with very large datasets, much of the data unstructured.
Based on the breadth of research into new AI algorithms and model architectures, Koumchatzky argued, “betting on a platform that natively supports deep learning and complex graphs is key to leveraging the promises of that work.”
The Cortex team of data scientists and machine-learning researchers works on Twitter’s deep learning platform. New members were added last year when Twitter acquired London-based AI startup Magic Pony Technology. These and other deals are part of a push by social media and other hyper-scalers to train rather than just program algorithms. The deals also underscore how AI technology is finding new automation applications required to organize huge datasets.
Now that Twitter has worked out most of the kinks in its deep learning models and platform, Koumchatzky said, “online experiments have also shown significant increases in metrics such as tweet engagement and time spent on the platform.”