Research: A Study of “Churn” in Tweets and Real-Time Search Queries (Extended Version)
Applicability: “A Study of “Churn” in Tweets and Real-Time Search Queries (Extended Version)” offers insight into the temporal dynamics of term distributions, which may hold implications for the design of search systems. The growing importance of real-time search brings with it several information retrieval challenges; this paper frames one such challenge, that of rapid changes to term distributions, particularly for queries.
Abstract: The real-time nature of Twitter means that term distributions in tweets and in search queries change rapidly: the most frequent terms in one hour may look very different from those in the next. Informally, we call this phenomenon “churn”. Our interest in analyzing churn stems from the perspective of real-time search. Nearly all ranking functions, machine-learned or otherwise, depend on term statistics such as term frequency, document frequency, as well as query frequencies. In the real-time context, how do we compute these statistics, considering that the underlying distributions change rapidly? In this paper, we present an analysis of tweet and query churn on Twitter, as a first step to answering this question. Analyses reveal interesting insights on the temporal dynamics of term distributions on Twitter and hold implications for the design of search systems.
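To make the idea of “churn” concrete, the sketch below compares the most frequent terms in two consecutive hours and reports the fraction of the earlier hour's top terms that drop out in the next. This is an illustrative measure only, not necessarily the one used in the paper; the function names `top_terms` and `churn`, the cutoff `k`, and the sample tokens are all assumptions made for the example.

```python
from collections import Counter


def top_terms(tokens, k):
    """Return the set of the k most frequent terms in a list of tokens."""
    return {term for term, _ in Counter(tokens).most_common(k)}


def churn(prev_tokens, next_tokens, k=3):
    """Fraction of the previous interval's top-k terms that are absent
    from the next interval's top-k terms (0.0 = no change, 1.0 = full turnover)."""
    prev_top = top_terms(prev_tokens, k)
    next_top = top_terms(next_tokens, k)
    if not prev_top:
        return 0.0
    return len(prev_top - next_top) / len(prev_top)


# Hypothetical token streams for two consecutive hours.
hour_1 = ["a", "a", "a", "b", "b", "c"]
hour_2 = ["d", "d", "d", "a", "a", "e"]

print(churn(hour_1, hour_2, k=3))
```

A value near 1.0 means that term statistics computed in one hour say little about the next, which is the difficulty the abstract raises for ranking functions that depend on term and query frequencies.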
Analysis: Summarized analysis from this paper includes observations on:
Authors: Written by Jimmy Lin and Gilad Mishne of Twitter, Inc., “A Study of “Churn” in Tweets and Real-Time Search Queries (Extended Version)” is a paper submitted to and accepted by the 6th International AAAI Conference on Weblogs and Social Media (ICWSM 2012).
This entry was posted on Tuesday, June 5th, 2012 at 2:39 pm. It is filed under chronology, discover and tagged with research, social media.
ComplexDiscovery | Creative Commons Attribution 4.0 International