Showing posts with label data analysis. Show all posts

Sunday, May 20, 2012

Big data analysis and politics

Big data analysis is popping up in more and more contexts, and in an election year, what better place than politics? A ‘barcamp’ for campaign finance took place at Stanford on May 19-20, 2012, where a hundred participants quietly busied themselves in ten teams, exploring different ways to apply big data and informatics to analyzing and improving the political process.

At the weekend hackathon, DataFest: Analyzing campaign finance data, participants learned some basic R, scripting, and quantitative data techniques to analyze data from websites like Influence Explorer. One project, for example, attempted to determine whether members of the House of Representatives who changed their vote at the last minute might have been influenced by campaign donations or by other factors, such as constituent demographics, ideology, and party voting record, obtained through data-mining techniques.

Sunday, February 05, 2012

The big data era's flux and pulse

Big data is an important contemporary trend, but what does it actually mean?

What is big data?
Big data refers not just to the absolute size of a body of information (which currently can be on the order of terabytes, petabytes, and exabytes), but also to its usability and manageability. Some of the defining parameters of big data are its large size, high-velocity activity (incoming, processing, outgoing), heterogeneous nature (a variety of structured and unstructured data types like video and images), and requirement for real-time analytics.

What is the process of working with big data?
The process of working with big data involves several steps. First there may be an exploration of the data using tools for classification, visualization, and summarization. Then there is the detailed step of data cleaning to make the data consistent and usable. The next step is data reduction, for example defining and extracting attributes, decreasing the dimensions of data, representing the problems to be solved, summarizing the data, and selecting portions of the data for analysis. Then, the steps of predictive analytics, scoring, reporting, publishing, and quality validation and maintenance can be applied.
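To make the first few steps concrete, here is a minimal sketch in Python of exploration, cleaning, and reduction on a tiny, made-up set of donation records. The records, field names, and function names are all hypothetical and chosen only for illustration; a real big data workflow would run the same kinds of steps on distributed tooling rather than in-memory lists.

```python
from statistics import mean

# Hypothetical toy records; the field names are illustrative assumptions.
raw_records = [
    {"donor": " Alice ", "amount": "250", "state": "ca"},
    {"donor": "Bob", "amount": "", "state": "CA"},
    {"donor": "Carol", "amount": "1000", "state": "ny"},
    {"donor": "Dave", "amount": "40", "state": "NY"},
]


def explore(records):
    """Exploration: count the rows and how many lack an amount."""
    missing = sum(1 for r in records if not r["amount"].strip())
    return {"rows": len(records), "missing_amount": missing}


def clean(records):
    """Cleaning: trim whitespace, normalize case, drop rows with no amount."""
    cleaned = []
    for r in records:
        amount = r["amount"].strip()
        if not amount:
            continue  # unusable row: no amount recorded
        cleaned.append({
            "donor": r["donor"].strip(),
            "amount": float(amount),
            "state": r["state"].strip().upper(),
        })
    return cleaned


def reduce_by_state(records):
    """Reduction: keep two attributes and summarize the amounts per state."""
    by_state = {}
    for r in records:
        by_state.setdefault(r["state"], []).append(r["amount"])
    return {state: mean(amounts) for state, amounts in by_state.items()}


summary = explore(raw_records)
cleaned = clean(raw_records)
state_means = reduce_by_state(cleaned)

print(summary)
print(state_means)
```

The later steps (predictive analytics, scoring, reporting) would take a reduced table like `state_means` as their input.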

What are the applications of big data analysis?
Some of the benefits of big data analysis are the ability to summarize information, make predictions, identify trends (for example, consumer spending patterns), and rank and prioritize information. Specific algorithms are employed for each task: for summarizing, clustering and associations; for making predictions, tree-based methods, neural networks, and k-nearest neighbors; for identification, anomaly detection, similarities and matches, and change detection; and for ranking, logistics and frequency detection.
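As one illustration of the prediction methods named above, here is a minimal sketch of k-nearest neighbors in Python. The training points, labels, and the function name `knn_predict` are assumptions made up for this example, not something taken from the ACM talk; it uses plain Euclidean distance and a simple majority vote.

```python
import math
from collections import Counter


def knn_predict(training, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    by_distance = sorted(training, key=lambda item: math.dist(item[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical two-feature points with made-up labels.
training = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((2.0, 1.5), "low"),
    ((8.0, 8.0), "high"),
    ((8.5, 9.0), "high"),
    ((9.0, 8.5), "high"),
]

print(knn_predict(training, (2.0, 2.0)))
print(knn_predict(training, (7.5, 8.0)))
```

Sorting every training point per query is fine for a toy dataset; at big data scale the neighbor search would normally rely on an index or an approximate method instead.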

Excerpted from an Association for Computing Machinery (ACM) talk on Big Data & Predictive Analytics (slides).

Sunday, May 09, 2010

The big data graph era

With the start of the big data era and the ability to collect, store, and make sense of large numbers of data points, the cultural outlook of the world is shifting too. Graphs, graphs, graphs. Individuals and communities have a social graph, taste graph, preference graph, affinity graph, attention graph, intention graph, values graph, emotion graph, health graph and more.

Graph theory is being applied to many new contexts such as social networks, media consumption, nanotechnology fabrication, gaming, and genomic analysis, and could be one of the many data analysis techniques applied to any large dataset. Very large datasets (VLDS) and the move back into the cloud mean that sophisticated data analysis and artificial intelligence techniques could become an expected feature of websites, just as social networking commentary and gaming elements have become today.
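For a sense of what applying graph theory to a social graph looks like in practice, here is a small sketch in Python. The people and connections are invented for illustration, and the helper names are hypothetical. It builds an undirected graph, counts each person's connections (degree), and uses breadth-first search to find how many hops separate two people.

```python
from collections import deque

# Hypothetical undirected social graph: each pair is a mutual connection.
edges = [("ann", "bo"), ("bo", "cy"), ("cy", "di"), ("ann", "cy"), ("di", "ed")]


def build_graph(edge_list):
    """Build an adjacency map from a list of undirected edges."""
    graph = {}
    for a, b in edge_list:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degrees(graph):
    """Degree of each node: how many direct connections it has."""
    return {node: len(neighbors) for node, neighbors in graph.items()}


def hops(graph, start, goal):
    """Breadth-first search: fewest edges from start to goal, or None."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


graph = build_graph(edges)
print(degrees(graph))
print(hops(graph, "ann", "ed"))
```

The same two measures, who is most connected and how far apart two nodes are, carry over to a taste graph or affinity graph by changing what the nodes and edges represent.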