NoSQL

16 September 2020

NoSQL is an approach to implementing scalable data storage (databases) with a flexible data model that differs from the classical relational DBMS.

In non-relational databases, the problems of scalability and availability, which are important for Big Data, are solved at the expense of atomicity and consistency.

Why do we need non-relational databases in Big Data: History of appearance and development

NoSQL databases are optimized for applications that must quickly process large amounts of data with varying structures at low latency. Non-relational storage is thus aimed squarely at Big Data. However, the idea of databases of this type originated much earlier than the term “big data”: back in the 1980s, in the mainframe era, it was used for hierarchical directory services.

The modern understanding of NoSQL DBMS emerged in the early 2000s, as part of the creation of parallel distributed systems for highly scalable Internet applications, such as online search.

In general, the term NoSQL stands for “not only SQL”, marking a departure from the traditional approach to database design. Initially, this was the name of a database created by Carlo Strozzi, which stored all data as ASCII files and used shell scripts instead of SQL queries to access it.

In the early 2000s, Google built its search engine and applications (Gmail, Maps, Earth, and other services) while solving problems of scalability and parallel processing of large volumes of data. This work produced distributed file and coordination systems, column family storage, and the MapReduce computing model.

After Google published descriptions of these technologies, they became very popular among open-source software developers. As a result, Apache Hadoop was created and the main related projects were launched. In 2007, another IT giant, Amazon.com, published a paper on its highly available Dynamo key-value store, the forerunner of Amazon DynamoDB.

Many other corporations then joined this race of NoSQL technologies for managing large data: IBM, Facebook, Netflix, eBay, Hulu, Yahoo!, and other IT companies, with their proprietary and open solutions.

Diversity of NoSQL Solutions

What are NoSQL DBMS: the main types of non-relational databases

All NoSQL solutions fall into four types:

Key-value

Key-value is the simplest form of data storage: a key is used to access a value within a large hash table.

Such DBMS are used for image storage, specialized file systems, and object caches, as well as in scalable Big Data systems, including gaming and advertising applications and Internet of Things (IoT) projects, including industrial ones (Industrial IoT, IIoT).

The best-known key-value DBMS are Oracle NoSQL Database, Berkeley DB, MemcacheDB, Redis, Riak, and Amazon DynamoDB. They support extensive partitioning, which provides a degree of horizontal scaling that is hard to achieve with other types of databases.
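The key-to-value lookup described above can be illustrated with a minimal in-memory sketch. The class name `KeyValueStore` and its methods are hypothetical and do not correspond to the API of any product listed; a Python dict stands in for the "large hash table".

```python
class KeyValueStore:
    """Toy in-memory key-value store backed by a dict (a hash table)."""

    def __init__(self):
        self._table = {}

    def put(self, key, value):
        """Store or overwrite the value for a key."""
        self._table[key] = value

    def get(self, key, default=None):
        """Return the value for a key, or `default` if the key is absent."""
        return self._table.get(key, default)

    def delete(self, key):
        """Remove a key; return True if it existed, False otherwise."""
        if key in self._table:
            del self._table[key]
            return True
        return False


store = KeyValueStore()
store.put("session:42", {"user": "ann", "cart": ["book"]})
print(store.get("session:42"))
print(store.get("session:99"))
```

The store never inspects the value, which is why such systems can hold images, cached objects, or arbitrary blobs equally well.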

Document oriented storage

Document-oriented storage holds data represented by key-value pairs packaged as a semi-structured document of tagged elements, in formats such as JSON, XML, and BSON. This model is well suited for catalogs, user profiles, and content management systems, where each document is unique and changes over time.

Therefore, document NoSQL DBMS are most often used in CMS, publishing, and document search. The best-known examples of document-oriented non-relational databases are CouchDB, Couchbase, MongoDB, eXist, and Berkeley DB XML.
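As a rough illustration of the document model, the sketch below keeps user profiles as JSON-compatible dicts whose fields differ from one document to the next. The `find` helper and the sample fields are assumptions made for this example, not the query API of any database named above.

```python
import json

# Two documents in the same collection with different sets of fields.
profiles = [
    {"_id": 1, "name": "Ann", "tags": ["admin"], "address": {"city": "Oslo"}},
    {"_id": 2, "name": "Bob", "last_login": "2020-09-16"},
]


def find(documents, **criteria):
    """Return the documents whose top-level fields equal all criteria."""
    return [
        doc for doc in documents
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


# Each document can be serialized to JSON and restored without a fixed schema.
serialized = json.dumps(profiles[0])
restored = json.loads(serialized)

print(find(profiles, name="Bob"))
print(restored["address"]["city"])
```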

Column storage

Column storage keeps information as a sparse matrix, with rows and columns used as keys. In the world of Big Data, column storage refers to column family databases. In such systems, the values themselves are stored in columns, which are kept in separate files.

Thanks to this data model, a large number of attributes can be stored in compressed form, which speeds up database queries, especially search and aggregation operations. Timestamps on values make such DBMS suitable for counters and for recording and processing time-related events: stock-exchange analytics systems, IoT/IIoT applications, content management systems, etc. The best-known column database is Google Bigtable, along with Apache HBase and Cassandra, which are modeled on it. This type also includes the less popular ScyllaDB, Apache Accumulo, and Hypertable.
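The sparse, timestamped layout can be sketched as a nested mapping from row key to column name to versions. This is a simplified illustration under that assumption; `ColumnFamilyTable` is a hypothetical name, and real column family stores add on-disk files, compression, and distribution on top of it.

```python
from collections import defaultdict


class ColumnFamilyTable:
    """Toy sparse table: row key -> column name -> {timestamp: value}."""

    def __init__(self):
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, value, timestamp):
        """Write one version of a cell; absent cells take no space."""
        self._rows[row][column][timestamp] = value

    def get(self, row, column):
        """Return the most recent value of a cell, or None if it is absent."""
        versions = self._rows.get(row, {}).get(column)
        if not versions:
            return None
        return versions[max(versions)]

    def columns(self, row):
        """List the columns that actually exist for a row."""
        return sorted(self._rows.get(row, {}))


table = ColumnFamilyTable()
table.put("sensor-1", "temp", 20.5, timestamp=1)
table.put("sensor-1", "temp", 21.0, timestamp=2)
table.put("sensor-2", "humidity", 40, timestamp=1)
print(table.get("sensor-1", "temp"))
print(table.columns("sensor-2"))
```

Each row holds only the columns that were written to it, and older versions of a cell remain available under their timestamps.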

Graph Storage

Graph storage is a network database that uses nodes and edges to represent and store data. Since the edges of the graph are stored, traversing the graph does not require additional computation (unlike a join in SQL). Indexes are still required to find the starting vertex of a traversal.

Graph DBMS usually support ACID requirements and specialized query languages (Gremlin, Cypher, SPARQL, GraphQL, etc.). Such DBMS are used in relationship-oriented tasks: social networks, fraud detection, public transport routes, road maps, network topologies.

Examples of graph databases: InfoGrid, Neo4j, Amazon Neptune, OrientDB, AllegroGraph, Blazegraph, InfiniteGraph, FlockDB, Titan, ArangoDB.
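The idea of stored edges plus an index for the starting vertex can be sketched with an adjacency map and a breadth-first traversal. The `Graph` class is hypothetical and in-memory only; it does not reproduce the storage format or query language of any product listed above.

```python
from collections import deque


class Graph:
    """Toy undirected graph: each node maps directly to its neighbours."""

    def __init__(self):
        self._adjacent = {}  # node -> set of neighbouring nodes

    def add_edge(self, a, b):
        """Store the edge in both directions."""
        self._adjacent.setdefault(a, set()).add(b)
        self._adjacent.setdefault(b, set()).add(a)

    def shortest_path_length(self, start, goal):
        """Breadth-first search; return the number of edges, or None."""
        if start not in self._adjacent or goal not in self._adjacent:
            return None
        seen = {start}
        queue = deque([(start, 0)])
        while queue:
            node, distance = queue.popleft()
            if node == goal:
                return distance
            for neighbour in self._adjacent[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append((neighbour, distance + 1))
        return None


friends = Graph()
friends.add_edge("ann", "bob")
friends.add_edge("bob", "carl")
friends.add_edge("dave", "eve")
print(friends.shortest_path_length("ann", "carl"))
print(friends.shortest_path_length("ann", "eve"))
```

The dict lookup plays the role of the index that finds the starting vertex; after that, every hop follows stored neighbours without any join-like computation.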

 

Types of NoSQL DBMS

The good and bad of non-relational databases: the main advantages and disadvantages

Compared with classical SQL databases, non-relational DBMS have the following advantages:

  • linear scalability – adding new nodes to the cluster increases the overall system performance;
  • flexibility, which allows working with semi-structured data and implementing, among other things, full-text search in the database;
  • the ability to work with different representations of information, including without specifying a data schema;
  • high availability due to data replication and other fault-tolerance mechanisms, in particular sharding – the automatic division of data across network nodes, so that each server in the cluster is responsible only for a certain subset of the data and handles the read and write requests for it. This increases the application's data processing speed and throughput;
  • performance through optimization for specific data models (document, graph, column, or “key-value”) and access patterns;
  • wide functionality – their own SQL-like query languages, RESTful interfaces, APIs, and complex data types such as map, list, and struct, which allow processing many values at once.
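The automatic division of data across nodes mentioned in the list above can be sketched with simple hash-based routing. This is only an illustration: the function `shard_for` and the node names are hypothetical, and modulo hashing is a deliberately simple stand-in for the partitioning schemes real systems use.

```python
import hashlib


def shard_for(key, nodes):
    """Pick the node responsible for a key using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then map it to a node.
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


nodes = ["node-a", "node-b", "node-c"]
for key in ("user:1", "user:2", "user:3"):
    print(key, "->", shard_for(key, nodes))
```

Every client that hashes the same key reaches the same node, so reads and writes for that key are served by one server. With plain modulo, changing the number of nodes remaps most keys, which is why distributed stores typically rely on consistent hashing or similar schemes instead.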

The flip side of these advantages is the following disadvantages:

  • the limited capabilities of the built-in query language. For example, HBase provides only 4 functions for working with data (Put, Get, Scan, Delete), and Cassandra has no Join operation despite its SQL-like query language. To work around this, third-party tools translate classic SQL queries into execution code for a specific non-relational database, for example Apache Phoenix for HBase or the general-purpose Apache Drill;
  • difficulties in supporting all ACID requirements for transactions (atomicity, consistency, isolation, durability), because NoSQL DBMS, working within the limits of the CAP theorem (consistency, availability, partition tolerance), tend to follow the BASE model (basically available, soft state, eventual consistency) instead. However, some non-relational DBMS try to get around this restriction with configurable consistency levels, as we described in the Cassandra example. Similarly, Riak lets you set the required availability-consistency trade-off even for individual queries by specifying the number of nodes that must confirm the successful completion of an operation. More details about the CAP and BASE models will be presented in a separate article;
  • tight coupling of the application to a specific DBMS due to the specifics of its internal query language and flexible, use-case-oriented data model;
  • a shortage of specialists in NoSQL databases compared with relational ones.
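The per-query tuning mentioned in the list above, where a request specifies how many nodes must confirm it, is commonly reasoned about with the quorum rule R + W > N: if the read and write quorums always overlap, a read sees the latest acknowledged write. The sketch below only checks that arithmetic; the function and parameter names are hypothetical and are not taken from Cassandra or Riak.

```python
def quorums_overlap(n_replicas, write_acks, read_acks):
    """Return True if every read quorum intersects every write quorum."""
    for name, value in (("write_acks", write_acks), ("read_acks", read_acks)):
        if not 1 <= value <= n_replicas:
            raise ValueError(f"{name} must be between 1 and {n_replicas}")
    # Overlap is guaranteed exactly when R + W > N.
    return write_acks + read_acks > n_replicas


# With 3 replicas: quorum reads and writes overlap, single-node ones may not.
print(quorums_overlap(3, write_acks=2, read_acks=2))
print(quorums_overlap(3, write_acks=1, read_acks=1))
```

Lower values favour availability and latency, while higher values favour consistency.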

Summing up the main aspects of non-relational DBMS, it is worth noting that the “NoSQL vs SQL” question is somewhat ill-posed, because these tools follow different architectural approaches and target different application tasks.

Traditional SQL databases cope perfectly well with strictly typed information of moderate volume, for example in a local ERP system or cloud CRM. However, when a large volume of semi-structured and unstructured data (i.e. Big Data) has to be processed in a distributed system, you should choose from the variety of NoSQL stores, taking into account the specifics of the task itself. In particular, Cassandra, which we discussed here, is well suited to standalone Internet of Things solutions, including industrial ones.

And for a multi-level IT infrastructure based on Apache Hadoop, it is worth considering HBase, which lets you work with data stored in HDFS quickly, almost in real time.
