Data Mining from Social Media

In the last post we learned some basic tools for scraping data from webpages and PDF files. In this post, I would like to introduce you to Netlytic, a tool that allows you to mine data from social media. Social media are a great resource for data mining: they can give you a broad picture of what issues people are talking about, what words they are using to discuss those issues, and who the leaders of the conversation are. Netlytic is a cloud-based text and social network analyzer that can automatically summarize online conversations on social media sites and discover the social networks within them.

Key Features

  • Import data from several platforms, including Twitter, Facebook, Instagram, and YouTube
  • Download datasets as CSV files
  • Create subsets to look at network progress over time
  • Text analysis visualizations
  • Network analysis visualizations and measures
  • Customizable text categories

Netlytic is very user-friendly. All you need to do is create a free account. In the rest of the post I will explain how to use this software for mining data from social media. As an example, I will collect and analyze data from the Jack J. Valenti School of Communication Summit on Sport Media, which was held today at the University of Houston.

After you log into Netlytic, go to the “New Database” tab. There you can select between different social media platforms (Twitter, Facebook, Instagram, etc.). For the purpose of this example, we will select data from Twitter. Next, name the dataset you are about to create; we will call ours “Sport Media Summit”. Then enter the keywords associated with the topic you are looking for. In our example, we will use the official hashtag for the summit, #SSMHTX.

Once you hit “Import”, your data will appear in a format that can be exported as a CSV file.
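
If you want to work with the exported file outside Netlytic, the sketch below shows one way to read it with Python's standard csv module. The column names "author" and "description" are assumptions made for illustration; check the header row of your own export and adjust them. The sample data is invented so that the snippet runs without a real file.

```python
import csv
import io

def load_posts(csv_file, text_column="description", author_column="author"):
    """Read an exported CSV and return a list of (author, text) pairs.

    The default column names are assumptions, not a documented export format.
    """
    reader = csv.DictReader(csv_file)
    posts = []
    for row in reader:
        text = (row.get(text_column) or "").strip()
        if text:  # skip rows that have no message text
            posts.append((row.get(author_column) or "", text))
    return posts

# Tiny in-memory stand-in for an exported file (invented data).
sample = io.StringIO(
    "author,description\n"
    "alice,Great panel at #SSMHTX\n"
    "bob,\n"
    "carol,Thanks @alice for the recap #SSMHTX\n"
)
posts = load_posts(sample)
print(len(posts), "posts loaded")
```

With a real export you would pass an open file instead of the sample, for example open("sport_media_summit.csv", newline="", encoding="utf-8"), where the file name is hypothetical.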

Textual Analysis

You can perform a textual analysis of your data to learn about the important topics within the community by creating three visualizations: a keyword cloud, a keyword stack, and a graph categories map.

Top 10 Most Frequently Used Words – To use this feature:
1) Go to the “Text Analysis” step,
2) Click the “Analyze” button in the “KEYWORD EXTRACTOR” panel.
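
To get a feel for what a keyword extractor does, here is a minimal Python sketch of a top-N word count. It is not Netlytic's own algorithm: the tokenizer, the short stop-word list, and the sample messages are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately short stop-word list for this example only.
STOP_WORDS = {"the", "a", "an", "and", "at", "for", "of", "to", "in", "is", "on", "rt"}

def top_words(messages, n=10):
    """Return the n most frequent tokens across all messages."""
    counts = Counter()
    for message in messages:
        # Lowercase, then keep word-like tokens, including hashtags and mentions.
        tokens = re.findall(r"[#@]?\w+", message.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)

# Invented sample messages.
messages = [
    "Great panel at #SSMHTX",
    "Thanks @alice for the recap of the panel #SSMHTX",
    "The panel on sport media is starting",
]
for word, count in top_words(messages, n=10):
    print(word, count)
```

A fuller stop-word list would usually be needed on real data, since common words otherwise dominate the ranking.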

Text Analysis: Manual Categories – to use this feature:
1) Go to the “Text Analysis” step,
2) Click the “Analyze” button under the “Manual Categories” panel.
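
The idea behind manual categories can be sketched in a few lines of Python: each category is a set of keywords, and a message counts toward a category if it contains at least one of them. The category names, keywords, and messages below are invented for illustration and are not Netlytic's defaults.

```python
from collections import Counter

# Hypothetical categories and keywords, for illustration only.
CATEGORIES = {
    "positive": {"great", "thanks", "love"},
    "event": {"panel", "summit", "speaker"},
}

def categorize(messages, categories=CATEGORIES):
    """Count how many messages contain at least one keyword from each category."""
    totals = Counter({name: 0 for name in categories})
    for message in messages:
        # Simple whitespace split; words with attached punctuation will not match.
        words = set(message.lower().split())
        for name, keywords in categories.items():
            if words & keywords:
                totals[name] += 1
    return dict(totals)

# Invented sample messages.
messages = [
    "Great panel at #SSMHTX",
    "Thanks @alice for the recap",
    "The speaker was late",
]
print(categorize(messages))
```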

The images below reflect the analysis of the Twitter data on the Valenti School Sport Media Summit.

A more specific tutorial on how to perform textual analysis using keywords (video 1) and using categories (video 2) can be found below:

Network Analysis

You can also perform a network analysis of your data to discover the connections and interactions within a group. Netlytic lets you create name networks (who mentions whom) and chain networks (who replies to whom).

Network: Top 10 Posters Mentioned in Messages – to use this feature:
1) Go to the “Network Analysis” step,
2) Click the “Analyze” button in the “NAME NETWORK” panel.
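
As a rough illustration of a "who mentions whom" network, the Python sketch below builds a weighted edge list from @mentions and ranks the most-mentioned users. It is a simplification under stated assumptions, not Netlytic's method: it only looks at @username patterns, and the sample posts are invented.

```python
import re
from collections import Counter

MENTION = re.compile(r"@(\w+)")

def name_network(posts):
    """Build weighted edges (poster, mentioned user) from (author, text) pairs."""
    edges = Counter()
    for author, text in posts:
        for mentioned in MENTION.findall(text):
            if mentioned.lower() != author.lower():  # ignore self-mentions
                edges[(author.lower(), mentioned.lower())] += 1
    return edges

def top_mentioned(edges, n=10):
    """Rank users by the total number of times they are mentioned."""
    mentions = Counter()
    for (_, target), weight in edges.items():
        mentions[target] += weight
    return mentions.most_common(n)

# Invented sample posts.
posts = [
    ("alice", "Great panel with @bob and @carol #SSMHTX"),
    ("carol", "Thanks @bob for moderating"),
    ("dave", "RT @bob: see you at #SSMHTX"),
]
edges = name_network(posts)
for user, count in top_mentioned(edges, n=10):
    print(user, count)
```

The same edge list could be written to a CSV file and opened in a network visualization tool if you want to draw the graph yourself.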

The image below reflects the network analysis of the Twitter data from the Sport Media Summit.

[Screenshot: network analysis of the Sport Media Summit Twitter data]

A more specific tutorial on how to perform network analysis can be found below:


You can find more Netlytic video tutorials on their YouTube channel.

In addition to Netlytic, there are other tools with similar features for analyzing social media data, such as Gephi and NodeXL. Note that both run on the desktop rather than in the cloud.

