Research: A Study of “Churn” in Tweets and Real-Time Search Queries (Extended Version)
Applicability: “A Study of “Churn” in Tweets and Real-Time Search Queries (Extended Version)” offers insight into the temporal dynamics of term distributions, which may hold implications for the design of search systems. The growing importance of real-time search brings with it several information retrieval challenges; this paper frames one such challenge: rapid changes to term distributions, particularly for queries.
Abstract: The real-time nature of Twitter means that term distributions in tweets and in search queries change rapidly: the most frequent terms in one hour may look very different from those in the next. Informally, we call this phenomenon “churn”. Our interest in analyzing churn stems from the perspective of real-time search. Nearly all ranking functions, machine-learned or otherwise, depend on term statistics such as term frequency, document frequency, as well as query frequencies. In the real-time context, how do we compute these statistics, considering that the underlying distributions change rapidly? In this paper, we present an analysis of tweet and query churn on Twitter, as a first step to answering this question. Analyses reveal interesting insights on the temporal dynamics of term distributions on Twitter and hold implications for the design of search systems.
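To make the idea of churn concrete, here is a minimal Python sketch. It assumes one plausible way to quantify churn: the fraction of the top-k terms in one time window that drop out of the top-k in the next window. The function names (`top_terms`, `churn`), the tie-breaking rule, and the toy token lists are illustrative assumptions, not taken from the paper, which may define and measure churn differently.

```python
from collections import Counter


def top_terms(tokens, k):
    """Return the k most frequent terms (hypothetical helper).

    Ties are broken alphabetically so the result is deterministic.
    """
    counts = Counter(tokens)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def churn(prev_tokens, next_tokens, k):
    """Fraction of the previous window's top-k terms that are
    missing from the next window's top-k (an assumed definition)."""
    prev_top = set(top_terms(prev_tokens, k))
    next_top = set(top_terms(next_tokens, k))
    if not prev_top:
        return 0.0
    return len(prev_top - next_top) / len(prev_top)


# Toy data standing in for the terms seen in two consecutive hours.
hour_1 = "oscars redcarpet oscars movie oscars movie weather".split()
hour_2 = "earthquake oscars earthquake tsunami earthquake tsunami weather".split()

print(churn(hour_1, hour_2, k=3))
```

In this toy data only "oscars" stays in the top three from one hour to the next, so the measured churn is high. A search system that computed term statistics once and reused them an hour later would be ranking against a stale distribution, which is the concern the abstract raises.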
Analysis: Summarized analysis from this paper includes observations on:
Authors: Written by Jimmy Lin and Gilad Mishne of Twitter, Inc., “A Study of “Churn” in Tweets and Real-Time Search Queries (Extended Version)” was submitted to and accepted by the 6th International AAAI Conference on Weblogs and Social Media (ICWSM 2012).
This entry was posted on Tuesday, June 5th, 2012 at 2:39 pm. It is filed under chronology, discover and tagged with research, social media.