Distributed file storage database MongoDB

Posted by nmreddy on Tue, 11 Jan 2022 01:59:56 +0100

Introduction to MongoDB

(article reprinted from Le byte)

Mongo does not mean Mango, but comes from the word Humongous.

MongoDB is a NoSQL database based on distributed file storage and written in C++. It aims to provide a scalable, high-performance data storage solution for web applications.

MongoDB sits between relational and non-relational databases: of all non-relational databases, it is the most feature-rich and the one that most resembles a relational database.

MongoDB stores data as BSON (Binary JSON) documents. As with key/value pairs in JSON, a field value can contain other documents, arrays, and arrays of documents. The query language is powerful, with a syntax somewhat like an object-oriented query language. It covers most of what a single-table query does in a relational database, and it also supports indexes.

MongoDB history

In 2007, Dwight Merriman, Eliot Horowitz and Kevin Ryan founded the software company 10gen. The company's initial goal was to enter the cloud computing industry and provide cloud computing services for enterprises. While developing their cloud products, they planned a database-like component to provide storage services for those products. At that time relational databases dominated, but the founders felt that a traditional relational database could not meet their requirements. They wanted a data store that programmers could use without knowing SQL.

After searching the available open source and closed source products, they found nothing satisfactory, so they decided to build their own, which they had the technical strength to do. The founders of 10gen all came from DoubleClick, an online advertising company that was acquired by Google, and 10gen was their second venture.

10gen had a specific reason for avoiding relational databases: its founders had suffered with them at DoubleClick. DoubleClick served many well-known companies in the United States and delivered 400,000 ads per second, but it often ran into problems with scalability and agility. The team repeatedly had to build custom data stores to work around the shortcomings of the relational databases of the day, which they found painful.

They therefore decided to develop a database product that would solve the problems they had encountered at DoubleClick and provide storage services for their cloud computing products.

  • MongoDB was originally developed in 2007 by 10gen, a New York-based organization now known as MongoDB Inc.
  • In 2009, after nearly two years of development, 10gen completed the prototype and officially named it MongoDB. At the same time, it set up an open source community and ran MongoDB through that community.
  • MongoDB 1.0 was released in February 2009, providing most basic query functions.
  • MongoDB 1.2 was released in December 2009 and introduced map-reduce to support large-scale data processing.
  • The first truly production-ready release was MongoDB 1.4, released in March 2010.
  • MongoDB 1.6 was released in August 2010 and introduced several major features, such as sharding for horizontal scaling, replica sets with automatic failover, and IPv6 support.
  • MongoDB 2.1 was released on May 23, 2012. This version adopted a new architecture and contained many enhancements.
  • MongoDB 2.0.6, a distributed document database, was released on June 6, 2012.
  • MongoDB 2.2 was released in August 2012 and introduced the aggregation pipeline, which combines multiple data processing steps into one chain of operations.
  • In 2013, MongoDB launched its first commercial version, MongoDB Enterprise Advanced.
  • MongoDB 2.4.3 was released on April 23, 2013, with performance optimizations, feature enhancements and bug fixes.
  • MongoDB 2.4.6 was released on August 20, 2013, again focusing on performance optimizations, feature enhancements and bug fixes.
  • MongoDB 3.0 was released in March 2015 with the new WiredTiger storage engine, a pluggable storage engine API, a higher replica set limit of 50 members, and security improvements. Later that year, version 3.2 added document validation, partial indexes, and some major aggregation enhancements.
  • In 2016, MongoDB launched the Atlas service in cooperation with public cloud providers (Google, Microsoft Azure). In the same year, MongoDB suffered a very serious security incident: attackers deleted data from instances exposed through the default listening address 0.0.0.0 and demanded ransoms of 0.2 to 0.5 bitcoin to restore it.
  • In October 2017, on its 10th anniversary, MongoDB went public through an IPO on NASDAQ. The shares were priced at US $24, the company's valuation reached US $1.6 billion, and the IPO raised US $192 million.
  • MongoDB 3.6 was released in November 2017, with better support for multi-collection join queries, change streams, and document validation using JSON Schema.
  • MongoDB 4.0 was released in June 2018. It attracted wide attention because it added multi-document transactions, an important milestone that made MongoDB ready for workloads with high data integrity requirements.
  • On March 18, 2019, Forrester named MongoDB a leader in NoSQL.
  • MongoDB 4.2 was released in October 2019, adding support for distributed transactions.
  • As of October 2020, the community version of MongoDB is 4.4.1, with better scalability and performance, lower replication latency, better availability and fault tolerance, stronger query capability and ease of use, and updates to the MongoDB cloud platform. MongoDB has gradually changed from a vendor focused on database services into a vendor providing data platform services.

By 2020, MongoDB had been downloaded 110 million times worldwide. MongoDB currently has more than 2,000 employees and more than 18,000 paying customers, many of whom use both MongoDB Atlas and MongoDB Enterprise Edition. Most large companies still use the Community Edition in some internal scenarios. MongoDB Community Edition remains open source and, apart from some key features, is similar to MongoDB Enterprise Edition.

Comparison of MongoDB and relational database terms

SQL terminology and concepts                           MongoDB terminology and concepts
-----------------------------------------------------  --------------------------------
database                                               database
table                                                  collection
row                                                    document or BSON document
column                                                 field
index                                                  index
table joins                                            embedded documents and linking
primary key (any unique column or column combination)  primary key (automatically set to the _id field)
aggregation (e.g. group by)                            aggregation pipeline, map-reduce function, and single purpose aggregation methods

MongoDB data types

Data type           Description
------------------  -----------
String              Character string, the most commonly used type for storing data. Strings in MongoDB must be valid UTF-8.
Integer             Integer value, used to store numbers. Depending on the server, it is 32-bit or 64-bit.
Boolean             Boolean value, used to store true / false.
Double              Double precision floating point value.
Min/Max keys        Compare a value against the lowest and highest BSON (Binary JSON) elements.
Arrays              Store an array, a list, or multiple values under one key.
Timestamp           Timestamp. Records the time at which a document was modified or added.
Object              Used for embedded documents.
Null                Used to create a null value.
Symbol              Symbol. Essentially equivalent to the string type, but generally used by languages that have a special symbol type.
Date                Date and time, stored in UNIX time format. You can specify your own date and time by creating a date object and passing in the year, month and day.
Object ID           Object ID. Used to create the ID of a document.
Binary Data         Binary data.
Code                Code type. Used to store JavaScript code in a document.
Regular expression  Regular expression type. Used to store regular expressions.

MongoDB download and installation

Download

On the download page, select MongoDB Community Server and pick the build that matches your system. I use CentOS. MongoDB only offers a RedHat build, which can be downloaded and used on CentOS.

CentOS is short for Community ENTerprise Operating System. It is a distribution of the Linux operating system.

CentOS is not a brand-new Linux distribution. It is a clone of Red Hat Enterprise Linux (hereinafter RHEL), the enterprise edition of the Red Hat family. RHEL is a Linux distribution adopted by many enterprises. It can be used only after paying Red Hat, which in return provides paid services, technical support and version upgrades. CentOS builds the same kind of Linux environment as RHEL without any product or service fees to Red Hat, but it also comes without paid technical support or upgrade services.

Confirm that the version you download supports your operating system.

Install

Upload the archive to /usr/local/src on the server, extract it to /usr/local and rename the extracted directory to mongodb.

# Extract mongodb into /usr/local
tar -zxvf /usr/local/src/mongodb-linux-x86_64-rhel70-4.4.1.tgz -C /usr/local/
# Rename the extracted directory to mongodb.
# /usr/local/mongodb must not exist yet, otherwise mv would place the
# extracted directory inside it and bin/ would end up one level too deep.
mv /usr/local/mongodb-linux-x86_64-rhel70-4.4.1/ /usr/local/mongodb

Create the data and log directories

Create the directories that will hold the data and the logs, then adjust their permissions so that the user running MongoDB can read and write them.

# Create a directory for storing data
mkdir -p /usr/local/mongodb/data/db
# Create a directory to store logs
mkdir -p /usr/local/mongodb/logs
# Create logging file
touch /usr/local/mongodb/logs/mongodb.log
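
The text above mentions adjusting permissions, but the commands do not show that step. The following is a minimal sketch of how it might look, not the author's procedure. The helper name setup_mongodb_dirs and the modes 700 / 600 (access for the owning user only) are assumptions chosen for illustration. The demonstration runs against a throwaway directory so that it can be tried without root.

```shell
# Sketch: create the data and log directories under a given prefix and
# restrict access to the owning user.
# setup_mongodb_dirs is a hypothetical helper, not part of MongoDB.
set -eu

setup_mongodb_dirs() {
    prefix="$1"
    # Same layout as in the article: <prefix>/data/db and <prefix>/logs
    mkdir -p "$prefix/data/db" "$prefix/logs"
    touch "$prefix/logs/mongodb.log"
    # Assumed modes: owner may read, write and enter the directories,
    # and read and write the log file. Nobody else gets access.
    chmod 700 "$prefix/data/db" "$prefix/logs"
    chmod 600 "$prefix/logs/mongodb.log"
}

# Demonstration against a temporary directory
demo_prefix="$(mktemp -d)/mongodb"
setup_mongodb_dirs "$demo_prefix"
ls -ld "$demo_prefix/data/db" "$demo_prefix/logs" "$demo_prefix/logs/mongodb.log"
```

On the server used in this article the call would be setup_mongodb_dirs /usr/local/mongodb, run as the user that will start mongod.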

Start MongoDB

Foreground start

The default startup mode of MongoDB is foreground startup. The so-called foreground startup means that MongoDB will occupy the current terminal window after starting the process.

# Switch to the specified directory
cd /usr/local/mongodb/
# Foreground start
bin/mongod --dbpath /usr/local/mongodb/data/db/ --logpath /usr/local/mongodb/logs/mongodb.log --logappend --port 27017 --bind_ip 0.0.0.0
  • --dbpath: specifies the directory where data files are stored
  • --logpath: specifies the log file. Note that this must be a file, not a directory
  • --logappend: append to the log file instead of overwriting it
  • --port: specifies the port. The default value is 27017
  • --bind_ip: binds the service IP. If 127.0.0.1 is bound, the service can only be accessed locally. The default is the local address

Background start

Background startup means starting MongoDB as a daemon. Add --fork to the command.

# Background start
bin/mongod --dbpath /usr/local/mongodb/data/db/ --logpath /usr/local/mongodb/logs/mongodb.log --logappend --port 27017 --bind_ip 0.0.0.0 --fork

Starting from the command line is hard to manage, because every parameter has to be typed again each time. Instead, we can put the startup parameters in a configuration file and start the service by pointing at that file, which makes MongoDB easier to manage.

Configuration file

Add a mongodb.conf configuration file in the bin directory.

# Data file storage directory
dbpath = /usr/local/mongodb/data/db
# Log file path (a file, not a directory)
logpath = /usr/local/mongodb/logs/mongodb.log
# Log by appending
logappend = true
# The default port is 27017
port = 27017
# There is no restriction on the access IP address. The default is the local address
bind_ip = 0.0.0.0
# Enabled as a daemon, that is, running in the background
fork = true
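
The file above uses the older key = value format. mongod also accepts a YAML configuration file, and the following sketch shows what the same six settings might look like in that format. The option names (systemLog, storage.dbPath, net.bindIp, processManagement.fork) come from MongoDB's documented configuration options. The file name mongod.yaml and the temporary location are assumptions for the demonstration only.

```shell
# Sketch: write a YAML equivalent of the configuration file to a
# throwaway location and print it. The path is a placeholder.
set -eu

conf_file="$(mktemp -d)/mongod.yaml"

cat > "$conf_file" <<'EOF'
# Log to a file and append instead of overwriting
systemLog:
  destination: file
  path: /usr/local/mongodb/logs/mongodb.log
  logAppend: true
# Data file storage directory
storage:
  dbPath: /usr/local/mongodb/data/db
# Port and listening address (0.0.0.0 accepts connections on all interfaces)
net:
  port: 27017
  bindIp: 0.0.0.0
# Run as a daemon in the background
processManagement:
  fork: true
EOF

cat "$conf_file"
```

A file like this would be passed to mongod with -f in the same way as the key = value version.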

Start

# Switch to the specified directory
cd /usr/local/mongodb/
# Start the service with the configuration file
bin/mongod -f bin/mongodb.conf

Client access

You can access the MongoDB server with the mongo shell in the bin directory.

The command is: bin/mongo --host <host address> (default 127.0.0.1) --port <port> (default 27017)

[root@localhost mongodb]# bin/mongo
MongoDB shell version v4.4.1
connecting to: mongodb://127.0.0.1:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("2bf54fad-83bc-444c-8bee-166a224445b8") }
MongoDB server version: 4.4.1
---
The server generated these startup warnings when booting: 
        2020-10-21T10:47:44.855+08:00: ***** SERVER RESTARTED *****
        2020-10-21T10:47:47.024+08:00: Access control is not enabled for the database. Read and write access to data and configuration is unrestricted
        2020-10-21T10:47:47.024+08:00: You are running this process as the root user, which is not recommended
        2020-10-21T10:47:47.024+08:00: /sys/kernel/mm/transparent_hugepage/enabled is 'always'. We suggest setting it to 'never'
        2020-10-21T10:47:47.024+08:00: /sys/kernel/mm/transparent_hugepage/defrag is 'always'. We suggest setting it to 'never'
        2020-10-21T10:47:47.024+08:00: Soft rlimits too low
        2020-10-21T10:47:47.024+08:00:         currentValue: 1024
        2020-10-21T10:47:47.024+08:00:         recommendedMinimum: 64000
---
---
        Enable MongoDB's free cloud-based monitoring service, which will then receive and display
        metrics about your deployment (disk utilization, CPU, operation statistics, etc).

        The monitoring data will be available on a MongoDB website with a unique URL accessible to you
        and anyone you share the URL with. MongoDB may use this information to make product
        improvements and to suggest MongoDB products and deployment options to you.

        To enable free monitoring, run the following command: db.enableFreeMonitoring()
        To permanently disable this reminder, run the following command: db.disableFreeMonitoring()
---
> 

The help command lists the available commands.

> help
	db.help()                    help on db methods
	db.mycoll.help()             help on collection methods
	sh.help()                    sharding helpers
	rs.help()                    replica set helpers
	help admin                   administrative help
	help connect                 connecting to a db help
	help keys                    key shortcuts
	help misc                    misc things to know
	help mr                      mapreduce

	show dbs                     show database names
	show collections             show collections in current database
	show users                   show users in current database
	show profile                 show most recent system.profile entries with time >= 1ms
	show logs                    show the accessible logger names
	show log [name]              prints out the last segment of log in memory, 'global' is default
	use <db_name>                set current database
	db.mycoll.find()             list objects in collection mycoll
	db.mycoll.find( { a : 1 } )  list objects in mycoll where a == 1
	it                           result of the last line evaluated; use to further iterate
	DBQuery.shellBatchSize = x   set default number of items to display on shell
	exit                         quit the mongo shell

Use db.version() to view the version information.

> db.version()
4.4.1

Use show dbs to view all databases.

> show dbs
admin   0.000GB
config  0.000GB
local   0.000GB

This is only a quick test of access through the client. Later articles will cover in detail how to use the client to work with MongoDB databases, collections, documents, indexes, built-in functions and related operations.

Shut down MongoDB

Shutting down a foreground instance

Press Ctrl + C.

Shutting down a background instance

Use the --shutdown parameter.

# Shut down an instance that was started from the command line
bin/mongod --dbpath /usr/local/mongodb/data/db/ --logpath /usr/local/mongodb/logs/mongodb.log --logappend --port 27017 --bind_ip 0.0.0.0 --fork --shutdown
# Shut down an instance that was started from the configuration file
bin/mongod -f bin/mongodb.conf --shutdown

Shutting down with the kill command

kill -9 forces the process to terminate. This method is generally not recommended.

# View the process information of mongodb running
ps -ef | grep mongodb
# kill -9 force off
kill -9 pid
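
Since kill -9 gives the process no chance to clean up, a gentler variant is to send the default TERM signal and wait for the process to exit. The sketch below illustrates that idea. The helper name stop_gracefully and the timeout values are assumptions, and a sleep process stands in for mongod so that the example can be run anywhere.

```shell
# Sketch: ask a process to stop with SIGTERM and wait until it is gone.
# stop_gracefully is a hypothetical helper, not a MongoDB tool.
set -eu

stop_gracefully() {
    pid="$1"
    timeout="${2:-10}"                            # seconds to wait, assumed default
    kill -TERM "$pid" 2>/dev/null || return 0     # process is already gone
    waited=0
    # kill -0 sends no signal; it only checks whether the process still exists
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "process $pid still running after ${timeout}s" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Demonstration with a stand-in process instead of mongod
sleep 300 &
demo_pid=$!
stop_gracefully "$demo_pid" 5
echo "process $demo_pid stopped"
```

For a real instance, the PID would be the one shown by ps -ef | grep mongodb.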

Shutting down with MongoDB functions

After connecting to the MongoDB service, switch to the admin database and shut down the service with one of the functions below.

# Connect to mongodb
bin/mongo
# Switch to the admin database
use admin
# Run either of the following to shut down the service
db.shutdownServer()
db.runCommand("shutdown")

Environment variables

So far, every MongoDB operation, such as starting the service or connecting with the client, has required switching to a specific directory first. Can these commands be run from any directory? Yes: you only need to add the MongoDB directory to the system environment variables.

First edit the system environment variable file through vim /etc/profile and add the following contents.

# Add environment variable
export MONGODB_HOME=/usr/local/mongodb
export PATH=$PATH:$MONGODB_HOME/bin

Then reload the system environment variables through source /etc/profile. In this way, MongoDB can be operated directly in any directory of the system.
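
One detail worth noting: each time /etc/profile is sourced again, the export line appends the same directory to PATH once more. The following sketch shows one way to guard against that. It is an assumed variation on the lines above, not something the original setup requires.

```shell
# Sketch: add $MONGODB_HOME/bin to PATH only when it is not already there.
set -eu

MONGODB_HOME="${MONGODB_HOME:-/usr/local/mongodb}"
export MONGODB_HOME

# Wrapping both sides in colons makes the match exact for a whole PATH entry
case ":$PATH:" in
    *":$MONGODB_HOME/bin:"*) ;;                  # already present, nothing to do
    *) PATH="$PATH:$MONGODB_HOME/bin" ;;
esac
export PATH

echo "$PATH"
```

Running these lines repeatedly leaves a single entry for the MongoDB bin directory in PATH.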

This article covered some entry-level MongoDB material and showed how to download and install MongoDB on Linux. Next, we will start with security: we will look at the painful lessons caused by running MongoDB without encryption and walk through some practical solutions along the way.
