Data Mining - Articles - United States - Archive
-
Archived articles and resources about data mining in the United States.
-
What the NSA can't do with your data (probably)
- By Adam Mazmanian. Federal Computer Week, June 12, 2013. "The National Security Agency probably isn't spying on you. It's just measuring you for risk, according to two experts on the science of predictive analytics and data mining.
The NSA's PRISM program, it has now been revealed, collects communications data from leading online commercial services and collects metadata and envelope information from mobile providers, including Verizon..."
-
Mining Social Data to Create a Content Strategy
- by Simon Penson, Search Engine Watch, March 11, 2013. "Social is a hugely exciting space right now. For businesses it gives us, for the first time, the ability to connect with customers individually, and in an incredibly targeted way.
For search and content marketers it isn't just the platform but also the data behind it that makes it really exciting. So how can you make use of what will become the most valuable available pot of data?
You can apply the data in much of your strategic work to inform decision-making, but one of the most useful, and relevant currently, is in helping steer content ideas and strategy..."
-
Twitter, Facebook now tools for Big Brother
- By David Saleh Rauf. Politico, 4 April 2012. "Uncle Sam wants to read your tweets and Facebook updates — and, in some cases, already scours your feeds.
Federal agencies have realized they can mine social media for intel to help thwart potential terrorist strikes, keep tabs on domestic protests and better help citizens after a natural disaster. But privacy groups are clamoring for Congress to intervene, likening it to Big Brother..."
-
Everything You Wanted to Know About Data Mining but Were Afraid to Ask
- By Alexander Furnas. The Atlantic, April 3, 2012. "A guide to what data mining is, how it works, and why it's important.
Big data is everywhere we look these days. Businesses are falling all over themselves to hire 'data scientists,' privacy advocates are concerned about personal data and control, and technologists and entrepreneurs scramble to find new ways to collect, control and monetize data. We know that data is powerful and valuable. But how?
This article is an attempt to explain how data mining works and why you should care about it..."
-
ATO seeks waiver to hunt data on taxpayers' investments
- by Sean Parnell, FOI Editor. The Australian, July 27, 2011. "The Australian Taxation Office wants a permanent exemption from privacy guidelines covering its data-matching operations to compile dossiers on taxpayers' property arrangements and investments.
The tax office is one of the commonwealth agencies leading the fight in the data wars, using largely automated systems to deal with tax fraud and assist other agencies in their investigations..."
-
Google Correlate Whitepaper - in pdf format (247kb)
- By Matt Mohebbi, Dan Vanderkam, Julia Kodysh, Rob Schonberger, Hyunyoung Choi & Sanjiv Kumar. Draft date: May 25, 2011. "Trends in online web search query data have been shown useful in providing models of real world phenomena. However, many of these results rely on the careful choice of queries that prior knowledge suggests should correspond with the phenomenon. Here, we present an online, automated method for query selection that does not require such prior knowledge. Instead, given a temporal or spatial pattern of interest, we determine which queries best mimic the data. These search queries can then serve to build an estimate of the true value of the phenomenon. We present the application of this method to produce accurate models of influenza activity and home refinance rate in the United States. We additionally show that spatial patterns of phenomenon and queries serving as temporal phenomenon can surface interesting and useful correlations..."
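The idea in the abstract above (given a target pattern, find the queries whose series best mimic it) can be illustrated with a minimal sketch. It assumes Pearson correlation as the similarity measure, which the excerpt does not specify, and the query names and numbers are invented for illustration. It is not the whitepaper's implementation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series has no pattern to match
    return cov / sqrt(var_x * var_y)


def rank_queries(target, query_series, top_n=3):
    """Return up to top_n (query, correlation) pairs, best match first."""
    scored = [(pearson(target, series), query)
              for query, series in query_series.items()]
    scored.sort(reverse=True)
    return [(query, r) for r, query in scored[:top_n]]


# Hypothetical weekly values: a target series and search volumes per query.
target = [1.0, 2.0, 4.0, 3.0, 1.0]
queries = {
    "flu symptoms": [10, 20, 40, 30, 10],    # rises and falls with the target
    "beach holidays": [40, 30, 10, 20, 40],  # moves opposite to the target
    "weather": [5, 5, 5, 5, 5],              # flat, no signal
}

ranked = rank_queries(target, queries)
for query, r in ranked:
    print(f"{query}: r = {r:.2f}")
```

A brute-force scan like this only suits a small set of candidate queries; a production system would need a faster search over millions of series.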
-
Google Correlate: More Search Data to Mine
- by Vanessa Fox. Search Engine Land, May 25, 2011. "... With Google Correlate, you can upload data charted over either time or space and Google will look for matching patterns in search volumes. If you don’t have data of your own to upload, you can simply specify search terms, and Google will calculate the trending pattern and show matching patterns.
As Google notes in their documentation, this is sort of the opposite of Google Trends..."
-
Mining patterns in search data with Google Correlate
- Posted by Matt Mohebbi, Software Engineer. The Official Google Blog, 25 May 2011. "... Using Correlate, you can upload your own data series and see a list of search terms whose popularity best corresponds with that real world trend..."
-
Scraping, cleaning, and selling big data
- Infochimps execs discuss the challenges of data scraping, by Audrey Watters. O'Reilly Radar, 11 May 2011. "In 2008, the Austin-based data startup Infochimps released a scrape of Twitter data that was later taken down at the request of the microblogging site because of user privacy concerns. Infochimps has since struck a deal with Twitter to make some datasets available on the site, and the Infochimps marketplace now contains more than 10,000 datasets from a variety of sources. Not all these datasets have been obtained via scraping, but nevertheless, the company's process of scraping, cleaning, and selling big data is an interesting topic to explore, both technically and legally.
With that in mind, Infochimps CEO Nick Ducoff, CTO Flip Kromer, and business development manager Dick Hall explain the business of data scraping in the following interview..."
-
Newly Declassified Files Detail Massive FBI Data-Mining Project
- By Ryan Singel. Wired Threat Level, September 23, 2009. "A fast-growing FBI data-mining system billed as a tool for hunting terrorists is being used in hacker and domestic criminal investigations, and now contains tens of thousands of records from private corporate databases, including car-rental companies, large hotel chains and at least one national department store, declassified documents obtained by Wired.com show..."
-
Twendz - Twitter data mining web application
- "twendz is a Twitter mining Web application that utilizes the power of Twitter Search, highlighting conversation themes and sentiment of the tweets that talk about topics you are interested in. As the conversation changes, so does twendz by evaluating up to 70 tweets at a time. When new tweets are posted, they are dynamically updated, minute by minute... twendz uses a keyword-based approach to score tweets. Meaningful words in each tweet are compared against a “dictionary” of thousands of words that are associated with positive or negative sentiment; each word receives a score that, when combined with the other scored words, allows twendz to make an educated guess at the overall tone of a tweet. After twendz scores a handful of tweets matching certain criteria, it extracts key terms, assigns a tone rating to each of those, and assembles them in a word cloud..."
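The keyword-based scoring described above can be sketched as follows. This is a minimal illustration, not twendz's actual code: the tiny dictionary, the +1/-1 word scores, and the function names are all assumptions made for the example, whereas the real tool is described as using a dictionary of thousands of words.

```python
import re

# Hypothetical miniature sentiment dictionary: word -> score.
SENTIMENT = {
    "love": 1, "great": 1, "happy": 1,
    "hate": -1, "awful": -1, "broken": -1,
}


def score_tweet(text, lexicon=SENTIMENT):
    """Sum the scores of every dictionary word found in the tweet."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(word, 0) for word in words)


def tone(score):
    """Map a combined score to an overall tone label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Invented example tweets.
tweets = [
    "I love this great phone",
    "Awful, broken update again",
    "Just landed in Austin",
]
for tweet in tweets:
    print(tone(score_tweet(tweet)), "|", tweet)
```

Summing per-word scores ignores negation and sarcasm ("not great" still scores positive), which is why the description above calls the result an educated guess.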
-
Data.gov and lessons from the open-source world
- by Alan Noble. Government 2.0 Taskforce, August 26, 2009. "A previous blog post talked about what data government departments should be releasing. In this post I like to talk about how to release it. One approach is to centralise things... Another approach is decentralised, and would be modeled on a 'bazaar'. In this approach, government web sites scattered around the Internet would utilise Web 2.0 technologies to provide data in both human and machine readable data and metadata formats..."
-
Visualization tools improve transparency by making sense of raw data
- By Joab Jackson. Government Computer News, August 24, 2009. "... For Data.gov, agencies have placed thousands of data feeds as comma-separated values (CSV) files or Really Simple Syndication feeds. Although that is a good step toward greater transparency, agencies could also start thinking about ways to better present the data so people and fellow government employees can make better sense of all the material. A new crop of visualization tools is coming on the market to make that job easier..."
-
Panel: Government data-mining programs need more scrutiny
- By Ben Bain. Federal Computer Week, October 7, 2008. "The federal government should systematically evaluate the effectiveness and lawfulness of homeland security-related data mining and behavioral surveillance programs before deployment, according to a new report from a committee of the National Research Council of the National Academies..."
-
Live data mining a step closer
- by Julian Bajkowski. The Australian Financial Review, 17 June 2008. "Government departments could soon get immediate feedback on how policies resonate with constituents after Human Services Minister Joe Ludwig flagged plans to make wider use of de-identified data mined from personal records..."
This category last updated: 13 June 2013