Our first 12 months with OrientDB – a GraphDB development journey

When I started Techifide, I had a vision of partnership between us and the large organisations using our services, both the consultancy and the software development capability. I also wanted to work with the power that drives entrepreneurs to succeed and walk alongside startups or new products, providing them with the best experience they could have in terms of consultancy and service for an affordable, fair price.

To achieve Techifide’s goals we needed a team of capable developers and a set of tools and open-source computer languages that could deliver fast, reliable and robust pieces of software ready to work with a large volume of data. The “capable developers” part was hard to find but we got there, while the technology was not so difficult: NodeJs, Elixir and Python for the back-end; AngularJs and ReactJs for the front-end. The big question then was which database technology would suit my big data requirements.

Our database should be:

  • Easy to scale
  • Flexible to adapt to an evolving application’s requirements
  • Cost effective for startups
  • Capable of following any path the application might take as it evolves

We found OrientDB: “The World’s First Distributed Multi-Model NoSQL Database with a Graph Database Engine”. It has several interesting features, but being free was its most attractive selling point, so we opted for a series of extensive bench tests against its closest competitor (Neo4J) before we committed ourselves. Using a series of complex queries, OrientDB outperformed its competitor, processing data up to 35 times faster. So, in a leap of faith (and based on many hours reading benchmarks and DB comparisons online, such as this Slide Share), we adopted OrientDB as our primary database.

Starting with the things we learned: the difficult bits

DB Import Functionality

While exporting a database is easy, importing can only be performed using the command line. If you come from the phpMyAdmin world, then frustration awaits you, as Studio lacks this feature.

Immutability Principle vs OrientDB

The idea of keeping variables immutable in the code conflicts with the concept of having a mutable database. You will become very familiar with the following error message: “Cannot update the record because the version is not the latest”. That happens when your “immutable” variable and the stored record have drifted apart: the record has changed in the database but not in the code, or vice versa.
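The version check behind that error message can be sketched in a few lines of Python. This is a toy in-memory store, not OrientDB’s API: `VersionedStore`, `StaleVersionError` and `update_with_retry` are hypothetical names invented for illustration, and the record id is made up. The assumption is simply that every record carries a version number, that an update is accepted only when the caller’s snapshot holds the latest version, and that one common remedy is to reload and try again.

```python
from dataclasses import dataclass


class StaleVersionError(Exception):
    """Raised when an update is based on an outdated copy of a record."""


@dataclass(frozen=True)
class Record:
    # Immutable snapshot held by application code.
    rid: str
    version: int
    data: dict


class VersionedStore:
    """Toy in-memory store with optimistic version checks (hypothetical)."""

    def __init__(self):
        self._records = {}

    def insert(self, rid, data):
        record = Record(rid, 1, dict(data))
        self._records[rid] = record
        return record

    def load(self, rid):
        return self._records[rid]

    def update(self, snapshot, changes):
        current = self._records[snapshot.rid]
        # Reject the write if the caller's snapshot is not the latest version.
        if snapshot.version != current.version:
            raise StaleVersionError(
                f"{snapshot.rid}: stored v{current.version}, yours v{snapshot.version}"
            )
        updated = Record(snapshot.rid, current.version + 1, {**current.data, **changes})
        self._records[snapshot.rid] = updated
        return updated


def update_with_retry(store, rid, changes, attempts=3):
    """Reload the latest snapshot before each attempt, then apply the changes."""
    for _ in range(attempts):
        try:
            return store.update(store.load(rid), changes)
        except StaleVersionError:
            continue
    raise StaleVersionError(f"{rid}: gave up after {attempts} attempts")


store = VersionedStore()
mine = store.insert("#12:0", {"name": "Ada"})
store.update(store.load("#12:0"), {"name": "Grace"})  # someone else writes first

try:
    store.update(mine, {"city": "London"})  # our snapshot is now stale
    conflict = False
except StaleVersionError:
    conflict = True

latest = update_with_retry(store, "#12:0", {"city": "London"})
```

In this sketch the stale write is refused, while the retry succeeds because it starts from a freshly loaded snapshot. Whether to retry, merge or surface the conflict to the user is an application decision.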

Now the good things and what excited us!

Data Migration

If you already have a database and want to migrate to OrientDB, no problem! Teleporter is ready for that task! It is easy to use, simple to configure and is what you need to ease the pain when migrating from a relational database. You can map Table to Class and Column to Attribute. Furthermore, you can change table and column names so that they follow Java or other standards.
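To make the renaming idea concrete, here is a small Python sketch of the kind of mapping described: snake_case table and column names turned into Java-style class and attribute names. It is not Teleporter’s configuration format or API. `to_camel_case` and `map_table` are hypothetical helpers and the `customer_order` table is invented for the example.

```python
def to_camel_case(name, capitalize_first=False):
    """Convert a snake_case identifier to camelCase (or PascalCase)."""
    parts = [p for p in name.lower().split("_") if p]
    if not parts:
        return name
    head = parts[0].capitalize() if capitalize_first else parts[0]
    return head + "".join(p.capitalize() for p in parts[1:])


def map_table(table_name, columns):
    """Map a relational table to a (class name, attribute names) pair."""
    class_name = to_camel_case(table_name, capitalize_first=True)
    # Keep the original column name as the key so the mapping stays traceable.
    attributes = {col: to_camel_case(col) for col in columns}
    return class_name, attributes


class_name, attributes = map_table(
    "customer_order", ["order_id", "created_at", "total"]
)
```

Here the table `customer_order` becomes the class `CustomerOrder`, and the column `order_id` becomes the attribute `orderId`.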

SQL Language

I doubt there are many programmers who don’t know the basics of SQL. It reads almost like plain English and is easy to learn.

SELECT columnA, columnB FROM table t WHERE valueV > 10;

OrientDB uses SQL. There are small differences but they will not make any developer roll their eyes in despair. The full documentation can be found at http://orientdb.com/docs/last/SQL.html

Speed

OrientDB’s performance gain by comparison to relational databases is a WOW. We compared OrientDB with MySQL and, as in the tests mentioned above, found OrientDB up to 35 times faster. Realistically, performance improvements will vary from 5 to 20 times, depending on the RDBMS, the volume of data and the complexity of the query.

Edges

It’s worth mentioning that, whilst not unique to OrientDB, edges can store more than links between elements: they are perfect for storing user actions. Let’s assume a relationship “UserU Created RecordR”. You can store the timestamp on the Created edge, so we know when the record was created. Another scenario: “UserU Updated RecordR”. Besides the timestamp, the Updated edge can store the valuesBefore and the valuesAfter of the update.
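As a rough illustration of the “UserU Updated RecordR” scenario, the sketch below models an edge as a plain Python object carrying its own properties. It is not OrientDB code. The `Edge` class and the `record_update` helper are hypothetical, and storing only the fields that changed is a design choice made for this example rather than something the database requires.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Edge:
    label: str        # e.g. "Created" or "Updated"
    out_vertex: str   # where the edge starts (the user)
    in_vertex: str    # where the edge ends (the record)
    properties: dict = field(default_factory=dict)


def record_update(user, record_id, before, after, when=None):
    """Build an "Updated" edge that captures the change itself."""
    # Only keep the fields whose value actually differs.
    changed = {k for k in before.keys() | after.keys() if before.get(k) != after.get(k)}
    return Edge(
        label="Updated",
        out_vertex=user,
        in_vertex=record_id,
        properties={
            "timestamp": when or datetime.now(timezone.utc),
            "valuesBefore": {k: before.get(k) for k in changed},
            "valuesAfter": {k: after.get(k) for k in changed},
        },
    )


edge = record_update(
    "UserU",
    "RecordR",
    before={"title": "Draft", "status": "open"},
    after={"title": "Final", "status": "open"},
)
```

A list of such edges between a user and a record doubles as an audit trail: who changed what, and when.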

MATCH Operator

Learning this operator is a necessity! It is easier and more performant than TRAVERSE and, in many cases, better than SELECT itself.

Object Oriented pattern

Neo4J and Arango don’t have this concept and you will love it in OrientDB. Besides keeping things organised within universes, it:

  1. prevents repetition, speeding up development
    Let’s imagine Class A has 10 attributes. If Class B extends A, B has all the attributes of A by default. In Neo4J you would have to write all of Class A’s attributes again in Class B;
  2. helps select data by universe/group
    Suppose you have a Class Property, a Class House that extends Property and another Class called Building, which extends House. The structure would look like this:
Property -> House -> Building
SELECT FROM Building # returns Buildings only
SELECT FROM House # returns Houses and Buildings
SELECT FROM Property # returns Properties, Houses and Buildings

It is like zooming in and out of your database. How cool is that!
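The Property, House and Building hierarchy can be mimicked in Python to show why the three queries return what they do. This is only an analogy built on Python’s own inheritance, not how OrientDB is implemented. `select_from` is a hypothetical stand-in for `SELECT FROM <Class>` and the addresses are invented.

```python
class Property:
    def __init__(self, address):
        self.address = address  # attribute declared once, on the base class


class House(Property):
    pass


class Building(House):
    pass


def select_from(records, cls):
    """Return records of cls and of every class that extends it."""
    return [r for r in records if isinstance(r, cls)]


records = [
    Property("1 Field Lane"),
    House("2 Oak Road"),
    Building("3 High Street"),
]

buildings = select_from(records, Building)
houses = select_from(records, House)
properties = select_from(records, Property)
```

Selecting from a class further up the hierarchy widens the result set, and `Building` gets the `address` attribute without redeclaring it, which is the repetition saving described in point 1.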

It is Free and Open Source

The majority of startups have limited resources but need to be prepared for expansion. With OrientDB’s Apache 2 Licence, there is nothing to pay. You only need the Enterprise Licence when you are able to pay for it.

Relational database support

We are currently developing two applications that have relational data patterns, and we adopted OrientDB for both. We found that development is much faster than with MySQL, for example, because the database schema is more flexible, it returns JSON data and it performs painless SQL joins using the MATCH operator.