Dealing with Big Data and Network Analysis Using Neo4j

Key Info
Description - a brief synopsis, abstract or summary of what the learning resource is about: 

In this lesson, you will learn how to use a graph database to store and analyze complex networked information. Networks are all around us. Social scientists use networks to better understand how people are connected. This information can be used to understand how things like rumors or even communicable diseases can spread throughout a community of people.
This tutorial will focus on the Neo4j graph database and the Cypher query language that comes with it.
- Neo4j is a free, open-source graph database written in Java that is available for all major computing platforms.
- Cypher is the query language for the Neo4j database; it is designed to insert and select information from the database.
By the end of this lesson you will be able to construct, analyze, and visualize networks based on big (or just inconveniently large) data. The final section contains code and data that illustrate the lesson's key points.

Authoring Person(s) Name: 
Jon MacKay
Authoring Organization(s) Name: 
The Programming Historian
License - link to legal statement specifying the copyright status of the learning resource: 
Creative Commons Attribution 4.0 International - CC BY 4.0
Access Cost: 
No fee
Citation - format of the preferred citation for the learning resource: 
Jon MacKay, "Dealing with Big Data and Network Analysis Using Neo4j," The Programming Historian 7 (2018),
Primary language(s) in which the learning resource was originally published or made available: 
Keywords - short phrases describing what the learning resource is about: 
Big data
Data analysis
Data coding
Data formats
Data visualization
Database design
Database management
Network analysis
Social science data
Subject Discipline - subject domain(s) toward which the learning resource is targeted: 
Arts and Humanities
Life Sciences
Social and Behavioral Sciences
Published / Broadcast: 
Tuesday, February 20, 2018
ID - identifier that provides the means to locate the learning resource or its citation: 
Type - namespace prefix for the citable locator, if any: 
Publisher - organization credited with publishing or broadcasting the learning resource: 
The Programming Historian
Media Type - designation of the form in which the content of the learning resource is represented, e.g., moving image: 
Interactive Resource - requires a user to take action or make a request in order for the content to be understood, executed or experienced.
Educational Info
Purpose - primary educational reason for which the learning resource was created: 
Instruction - detailed information about aspects or processes related to data management or data skills.
Learning Resource Type - category of the learning resource from the point of view of a professional educator: 
Learning Activity - guided or unguided activity engaged in by a learner to acquire skills, concepts, or knowledge that may or may not be defined by a lesson. Examples: data exercises, data recipes.
Target Audience - intended audience for which the learning resource was created: 
Citizen scientist
Data manager
Data professional
Early-career research scientist
Graduate student
Mid-career research scientist
Research faculty
Research scientist
Software engineer
Technology expert group
Intended time to complete - approximate amount of time the average student will take to complete the learning resource: 
More than 1 hour (but less than 1 day)