Neo4j Desktop is a really useful application for every graph enthusiast, developer or analyst who uses Neo4j on a regular basis.
Graphlytic's main goal is to make graph modeling and analytics for day-to-day operations as simple and straightforward as possible. We are constantly adding new features to the visualization and automation modules because we believe that working with graphs, answering questions based on graph data and automating tasks with graphs should be easy and accessible even with little or no technical knowledge.
Graphlytic is a graph analytics and visualization web application that can be installed in several ways and one of these ways is to install it in Neo4j Desktop for local usage. This article covers steps needed to install and run Graphlytic in Neo4j Desktop.
Installation Of Graphlytic In Neo4j Desktop
Supported platforms: macOS, Windows, and Ubuntu (latest versions). Basically, if you are able to run Neo4j Desktop on your machine you should be able to run Graphlytic.
Please contact us at support(at)graphlytic.biz or post on the Neo4j Community portal with any questions or suggestions on how to improve Graphlytic.
The prerequisite for installation of Graphlytic in Neo4j Desktop is the installation of Neo4j Desktop locally. The installation of Neo4j is pretty simple and beginner friendly. If you haven’t already, download, activate and start using Neo4j for free according to the step-by-step procedure on https://neo4j.com/developer/neo4j-desktop/.
Once Neo4j is installed and activated, you can focus on Graphlytic. The most common scenario for installing and running Graphlytic in Neo4j Desktop (shown in the short clip below) is:
- Run the Neo4j Desktop application on your machine. Click on the "Graph Application" icon (on the left side).
- Find the "Install Graph Application" row, enter the Graphlytic Desktop app URL https://npm.graphlytic.biz/graphlytic-desktop and click "Install".
- Add Graphlytic to your Neo4j Desktop Project by clicking on the "Add Application" tile.
- Click on the "Add Graph" tile to start a Neo4j Graph instance (or create a new one first, as in the video below).
- Start Graphlytic by clicking on the "Graphlytic Desktop" tile in the application tiles area (top of the screen).
- Reindex the fulltext index - this is especially needed when you connect Graphlytic to any existing Neo4j graph, with data already loaded in the db. More information can be found in the next chapters of this post.
Here is a short clip of all steps in running Graphlytic Desktop with a blank Neo4j Graph:
Next steps and resources
So, what can you do with your freshly installed Graphlytic? There are several use cases where Graphlytic's features can be very helpful, e.g.:
- Graph Modeling - manual modeling or graph generated from different data sources.
- Pattern searching and visualization with simple built-in analytics.
- Visualization and analysis of graphs with parallel relationships - this is useful particularly for analysis of event logs and communication logs. There is a short video later in the article on this topic.
- Scheduled Jobs for automatic data update and graph manipulation.
We are striving to get the right balance between two opposite things - a simple graph UI and support for complex tasks. We have achieved this through a combination of extensive configuration options and bespoke customization. Graphlytic is ready to be used out of the box for any graph data but the true value is in configuration options like:
When you want to use the fulltext index in Graphlytic, please configure it first and reindex the connected Neo4j graph. This is especially needed when Graphlytic is connected to an existing Neo4j graph for the first time - without reindexing, the fulltext search will not work (only the first 10 nodes will be accessible). After that, Graphlytic will automatically reindex any changes made in the graph.
Fulltext index configuration is accessible from these pages:
- "Search & Manage Data" - available in the main menu in the right part of the header
- "Visualization" - the fulltext configuration button is located in the header right next to the search input field
Steps for reindexing the fulltext index (see picture below):
- Open the "Fulltext search configuration"
- Choose properties that will be indexed
- Click on "Start indexing"
Visualization, Style Mappers and Views
Users can modify pretty much any aspect of the visualization in the UI, but in most cases there is a common understanding and interpretation of the graph data. This common interpretation can be used to create default (globally accessible for all users) styling objects, such as mappers and default visualization settings, which are then applied every time a user creates a new visualization.
- Documentation : Visualization Settings, Style Mappers, Style Views
- Video : Statistics, Layouts, Selections, Exploring
With this configuration, it's possible to create a repository of predefined views (queries). These views are then available to users on the Search page as tabs that a user can add from the repository with one click. There are two types of views: a query builder, which returns data as a paginated table with sortable columns, and a Cypher query, where the user enters any Cypher query and visualizes the result.
Users, Groups and Application Permissions
Graphlytic is a web application that only defined users can access. Users can be organized into groups, and each group can be given Application Permissions that determine, for example, whether its users can only read data or can also input data, and whether they can export data, share visualizations, change global settings, create jobs and more.
Data Access Management (or Data Security)
Every user group can have different permissions that determine which parts of the graph (nodes and relationships) and which properties can or cannot be accessed by its users. This allows creating specialized user groups with restricted privileges, for example a group that can access the graph itself but cannot access financial data stored in properties.
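The idea of restricting a group to certain labels and hiding selected properties can be illustrated with a minimal Python sketch. This is not Graphlytic's configuration format or API; the group names, the "revenue" property and the `visible_view` helper are all hypothetical and only show the kind of filtering such permissions imply.

```python
# Hypothetical per-group rules (illustrative names, not Graphlytic settings):
# which node labels a group may see and which property names are hidden.
GROUP_RULES = {
    "analysts": {"labels": {"Person", "Company"}, "hidden_properties": {"revenue"}},
    "auditors": {"labels": {"Person", "Company"}, "hidden_properties": set()},
}


def visible_view(node, group):
    """Return the part of a node that members of the group may see, or None."""
    rules = GROUP_RULES[group]
    if node["label"] not in rules["labels"]:
        return None  # the whole node is outside the group's permissions
    properties = {
        key: value
        for key, value in node["properties"].items()
        if key not in rules["hidden_properties"]
    }
    return {"label": node["label"], "properties": properties}


company = {"label": "Company", "properties": {"name": "Company 1", "revenue": 1000}}
analyst_view = visible_view(company, "analysts")   # "revenue" is filtered out
auditor_view = visible_view(company, "auditors")   # all properties remain
```

In this sketch the analysts still see that the company node exists, but not its financial property, which mirrors the restricted-group example above.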
Graphlytic contains an ETL module (Extract, Transform, Load) which allows creating jobs in the form of an XML document that defines a set of steps executed when the job is started (manually or scheduled with cron-like expressions). Every step can produce a dataset that is then passed as input to the next step. Each step is defined as the use of a driver with specific parameters. Graphlytic includes, for instance, drivers for CSV, Neo4j connection and Cypher execution, Mail, Groovy, Log, Text and XPath. New drivers can be added to a Graphlytic installation, for example when you need a specific JDBC driver or want to write your own driver in Java for use-case-specific post-processing after a data update.
- Documentation : Scheduled Jobs
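The "each step passes its dataset to the next step" idea can be sketched in a few lines of Python. This is only a conceptual illustration, not Graphlytic's XML job format or its driver API: the functions `csv_step`, `transform_step`, `log_step` and `run_job` are hypothetical stand-ins for drivers and the job runner.

```python
import csv
import io


def csv_step(text):
    """Hypothetical 'CSV driver': ignores its input and produces rows as dicts."""
    def run(_previous):
        return list(csv.DictReader(io.StringIO(text)))
    return run


def transform_step(func):
    """Hypothetical transformation step: applies func to every incoming row."""
    def run(previous):
        return [func(row) for row in previous]
    return run


def log_step(sink):
    """Hypothetical 'Log driver': records a message and passes the data through."""
    def run(previous):
        sink.append(f"{len(previous)} rows processed")
        return previous
    return run


def run_job(steps):
    """Execute steps in order, feeding each step the dataset of the previous one."""
    dataset = None
    for step in steps:
        dataset = step(dataset)
    return dataset


messages = []
job = [
    csv_step("name,amount\nA,1\nB,2\n"),
    transform_step(lambda row: {**row, "amount": int(row["amount"])}),
    log_step(messages),
]
result = run_job(job)
```

A real job would typically end with a step that writes to Neo4j; here the final dataset is simply returned so the chain is easy to inspect.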
Graphlytic Use Cases
We have successfully used a combination of configuration and customization in use cases such as Fraud Detection, IT Infrastructure Modeling, Communication Analysis, Source Code Refactoring, Workflow Analysis, Process Mining and more. Below are brief examples of two common scenarios where we used Graphlytic : Data Modeling and Communication Analysis.
Graph Modeling With Graphlytic
The video below shows how easy it is to model graphs (nodes and relationships) with Graphlytic.
First I'm going to create four nodes - two of them with the "Company" label and the other two with the "Person" label. Then I'm going to create relationships based on the ownership structure - in this case Person 1 owning some part of both companies and Person 2 owning part of the Company 2. Next I'm going to add the "name" property to each node with values like "Person 1", "Person 2" etc.
After modeling, I'm going to style the visualization a little bit and save it for later work or for sharing with other users.
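For readers who prefer to see the same small model as data, here is a minimal Python sketch of the four nodes and three ownership relationships described above, together with a helper that renders them as a Cypher CREATE statement. The relationship type "OWNS", the names "Company 1" and "Company 2", and the helper functions are assumptions made for illustration; in the video the model is built entirely in the Graphlytic UI.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: int
    end: int
    rel_type: str


def build_ownership_graph():
    """Two persons and two companies, connected by ownership relationships."""
    nodes = [
        Node(1, "Person", {"name": "Person 1"}),
        Node(2, "Person", {"name": "Person 2"}),
        Node(3, "Company", {"name": "Company 1"}),
        Node(4, "Company", {"name": "Company 2"}),
    ]
    relationships = [
        Relationship(1, 3, "OWNS"),  # Person 1 owns part of Company 1
        Relationship(1, 4, "OWNS"),  # Person 1 owns part of Company 2
        Relationship(2, 4, "OWNS"),  # Person 2 owns part of Company 2
    ]
    return nodes, relationships


def to_cypher(nodes, relationships):
    """Render the model as one Cypher CREATE statement.

    Property values are inserted as plain quoted strings without escaping,
    which is only acceptable for this fixed sample data.
    """
    parts = []
    for node in nodes:
        props = ", ".join(f"{key}: '{value}'" for key, value in node.properties.items())
        parts.append(f"(n{node.node_id}:{node.label} {{{props}}})")
    for rel in relationships:
        parts.append(f"(n{rel.start})-[:{rel.rel_type}]->(n{rel.end})")
    return "CREATE " + ",\n       ".join(parts)


nodes, relationships = build_ownership_graph()
statement = to_cypher(nodes, relationships)
```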
Communication And Process Analysis With Graphlytic
Over the years of using graphs for workflow, process, and communication analysis we have developed a set of features in Graphlytic that allow us to do this kind of work using graph models with large numbers of parallel relationships. This kind of model has its pros and cons, but in our opinion the pros are really good and the cons are at least manageable.
The graph model in such a case is really simple:
- In workflow or process analysis : nodes represent the states that analyzed entities can be in, and relationships represent events in which an entity changed its state. Each such relationship must have at least the entity_id and timestamp properties.
- In communication analysis : nodes represent entities that can communicate (e.g. people or machines), and relationships represent communication (e.g. a call or a message). Each such relationship must have at least the entity_id and timestamp properties.
Of course, this approach can be used only if each event connects exactly two nodes, which is not always the case, but we have found that most of the time it can be used and the result has some nice features:
- Simple graph model - most of the time there is only 1 type of node in the graph (1 label) e.g. "Person" in communication analysis or "State" in workflow/process analysis.
- Very few transformations during import and update of graph data - basically what we are importing are logs and every row from such log is represented with one relationship in the graph. This leads to easy data updating with a much lower chance of getting an inconsistent state in the graph.
- More data visualized with smaller and easy to understand graphs - unlike in traditional models used e.g. for fraud detection where events are modeled with nodes, the parallel model we use is roughly one-third of the size in number of nodes and relationships needed to communicate the same amount of source data.
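The "one log row becomes one relationship" mapping can be sketched in Python as follows. The log columns (caller, callee), the relationship type "CALLED" and the sample rows are hypothetical; only the entity_id and timestamp properties come from the model described above.

```python
import csv
import io
from collections import Counter

# Hypothetical communication log: every row is one call between two people.
SAMPLE_LOG = """caller,callee,entity_id,timestamp
Alice,Bob,call-1,2021-03-01T09:15:00
Alice,Bob,call-2,2021-03-01T10:40:00
Bob,Alice,call-3,2021-03-02T08:05:00
Bob,Carol,call-4,2021-03-02T11:30:00
"""


def log_to_relationships(log_text):
    """Turn every log row into exactly one relationship between two nodes."""
    reader = csv.DictReader(io.StringIO(log_text))
    return [
        {
            "start": row["caller"],
            "end": row["callee"],
            "type": "CALLED",  # assumed relationship type
            "properties": {
                "entity_id": row["entity_id"],
                "timestamp": row["timestamp"],
            },
        }
        for row in reader
    ]


relationships = log_to_relationships(SAMPLE_LOG)

# Nodes are just the distinct endpoints, so the model needs a single label.
nodes = sorted({r["start"] for r in relationships} | {r["end"] for r in relationships})

# Several rows between the same ordered pair of nodes become parallel relationships.
parallel_counts = Counter((r["start"], r["end"]) for r in relationships)
```

Because no intermediate event nodes are created, the import needs almost no transformation, and the number of relationships equals the number of log rows.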
Features implemented in Graphlytic for parallel models:
Virtual Relationship Models
It's not possible to work effectively with such a model in a visualization because the parallel relationships clutter it. Graphlytic has a feature for exactly this case - Virtual Relationship Models. With one click, parallel relationships can be merged into a single relationship in the visualization that represents all of them. You can either merge all parallel relationships regardless of direction, or merge only parallel relationships with the same direction (in which case there can be at most two relationships, with opposite directions, between any two nodes).
- Documentation : Tools Panel
- Documentation : Virtual Properties
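The two merge modes described above (ignoring direction versus keeping direction) can be illustrated with a small Python sketch. This is a conceptual example only, not how Graphlytic implements Virtual Relationship Models; the function name `merge_parallel` and the sample data are assumptions.

```python
from collections import Counter


def merge_parallel(relationships, ignore_direction=False):
    """Collapse parallel relationships into one entry per node pair.

    relationships is a list of (start, end) tuples. The result maps each
    merged relationship to the number of original relationships it represents.
    """
    merged = Counter()
    for start, end in relationships:
        if ignore_direction:
            # Treat (A, B) and (B, A) as the same pair.
            key = tuple(sorted((start, end)))
        else:
            key = (start, end)
        merged[key] += 1
    return dict(merged)


calls = [
    ("Alice", "Bob"),
    ("Alice", "Bob"),
    ("Bob", "Alice"),
    ("Bob", "Carol"),
]

directed = merge_parallel(calls)                           # at most two per node pair
undirected = merge_parallel(calls, ignore_direction=True)  # at most one per node pair
```

Keeping the count on each merged relationship is one possible design choice; it lets the visualization show how many original events a single merged edge stands for.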
When a timestamp (the date and time of the event) is stored on every relationship, Graphlytic's Timeline feature can be used to visualize only a selected time interval. This makes it quite easy to compare visualizations for different time periods, such as months or days.
- Documentation : Timeline
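Restricting a graph to a time interval amounts to filtering relationships by their timestamp property. The Python sketch below shows that idea under stated assumptions: timestamps are ISO 8601 strings, the interval is half-open, and the function name and sample events are hypothetical rather than part of Graphlytic.

```python
from datetime import datetime


def relationships_in_interval(relationships, start, end):
    """Keep relationships whose timestamp lies in the half-open interval [start, end)."""
    return [
        rel
        for rel in relationships
        if start <= datetime.fromisoformat(rel["timestamp"]) < end
    ]


EVENTS = [
    {"start": "Alice", "end": "Bob", "timestamp": "2021-03-01T09:15:00"},
    {"start": "Bob", "end": "Alice", "timestamp": "2021-03-02T08:05:00"},
    {"start": "Bob", "end": "Carol", "timestamp": "2021-04-10T11:30:00"},
]

# Compare two monthly slices of the same graph.
march = relationships_in_interval(EVENTS, datetime(2021, 3, 1), datetime(2021, 4, 1))
april = relationships_in_interval(EVENTS, datetime(2021, 4, 1), datetime(2021, 5, 1))
```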
This is what a parallel model looks like in Neo4j Browser:
This is how parallel models are handled in Graphlytic: