DB benchmark tool 은 안에 자동으로 키를 랜덤하게 생성을 하고 thread도 single thread, multi thread등을 생성하여
SSD에 데이터를 저장하고, 읽으면서 database을 benchmarking을 한다.
이제 levelDB 즉, 구글이 만든 이 DB benchmarking을 ssd에 해볼 생각이다.
그 전에 간단한 데이터베이스에 대한 알아보자
일단 요즘 인기 많아보이는 in memory database 즉 데이터 베이스가 메인메모리에 데이터를 저장하여 데이터를 관리하는 시스템이다. 자세한 설명은 이 webpage을 참고 하였다.
-An in-memory database (IMDB; also main memory database system or MMDB or memory resident database) is a database management system that primarily relies on main memory for computer data storage. It is contrasted with database management systems that employ a disk storage mechanism. Main memory databases are faster than disk-optimized databases since the internal optimization algorithms are simpler and execute fewer CPU instructions. Accessing data in memory eliminates seek time when querying the data, which provides faster and more predictable performance than disk.
key-value database
자세한 설명은 이 webpage을 참고 하였다.
- A key-value store, or Key value database, is a data storage paradigm designed for storing, retrieving, and managing associative arrarys, a data structure more commonly known todya as a dictionary or hash. Dictionaries contain a collection of object, or records, which in turn have many different fields within them, each containing data. These records are stored and retrieved using a key that uniquely identified the record, and is used to quickly find the data within the database.
now, as saying about levelDB, levelDB is key-value database using SSD(solid-state drive or rotating disk)
I refered to this webpage to benchmark SSD about levelDB.
basic setting
- Centos 7.2 / kernel 4.5
the below notifies the LevelDB refering to this site.
LevelDB is a light-weight, single-purpose library for persistence with bindings to many platforms.
this sorted by keys
- By default, LevelDB stores entries lexicographically stored by keys. The sorting is one of the main distinguishing features of LevelDB amongst similar embedded data strorage libraries and comes in very useful for querying as we’ll see later.
this is arbitrary byte arrays
- Both keys and values are treated as simple arrays of bytes, so content can anything from ASCII to binary blobs.
this compressed storage
- Google’s Snappy compression library is an optional dependency that can descrease the on-disk size of LevelDB stores with minimal sacrifice of speed. Snappy is highly optimized for fast compression and therefore does not provide particularly high compression ratios on common data.
on install process,
- 설치 방법은 기업용 리눅스를 제공해주는 회사에서 배포하는 배포판 levelDB를 다운받아 설치를 하는 것이다.
자세한 과정은 다음과 같다.
It refers to this site
- 공식 github site에서 release version을 받아서 설치를 하는 것이다.
자세한 과정은 다음과 같다.
it refers to this site
- sudo apt-get install libsnappy-dev
- wget https://leveldb.googlecode.com/files/leveldb-1.x.tar.gz
- tar -xzf leveldb-1.9.0.tar.gz
- cd leveldb-1.9.0
- make
- sudo mv libleveldb.* /usr/local/lib
- cd include
- sudo cp -R leveldb /usr/local/include
- sudo ldconfig
- 그냥 구글의 공식 github site에서 git clone을 해서 구글에서 제공해주는 소스를 활용하는 것이다.
git clone https://github.com/google/leveldb.git
- cd leveldb-1.x
- make
Reference site
-
[techoverfolw’s compling installing leveldb on linux]https://techoverflow.net/blog/2012/12/14/compiling-installing-leveldb-on-linux/
현재 아래 코드는 간단히 <a href = "https://github.com/google/leveldb">구글의 공식 사이트</a>에 소스를 git clone 을 하고 난다음 컴파일을 한후
테스한 결과들이다.
https://github.com/cBio/cbio-cluster/issues/309
[hyunyoung.lee@localhost out-static]$ ./c_test
=== Test create_objects
=== Test destroy
=== Test open_error
=== Test leveldb_free
=== Test open
=== Test put
=== Test compactall
=== Test compactrange
=== Test writebatch
=== Test iter
=== Test approximate_sizes
=== Test property
=== Test snapshot
=== Test repair
=== Test filter
=== Test cleanup
PASS
[hyunyoung.lee@localhost out-static]$ ./db_bench
LevelDB: version 1.18
Date: Fri Apr 1 14:16:21 2016
CPU: 4 * Intel(R) Core(TM) i3-4130 CPU @ 3.40GHz
CPUCache: 3072 KB
Keys: 16 bytes each
Values: 100 bytes each (50 bytes after compression)
Entries: 1000000
RawSize: 110.6 MB (estimated)
FileSize: 62.9 MB (estimated)
WARNING: Snappy compression is not enabled
------------------------------------------------
fillseq : 1.281 micros/op; 86.4 MB/s
fillsync : 4.735 micros/op; 23.4 MB/s (1000 ops)
fillrandom : 2.038 micros/op; 54.3 MB/s
overwrite : 2.459 micros/op; 45.0 MB/s
readrandom : 3.449 micros/op; (1000000 of 1000000 found)
readrandom : 3.154 micros/op; (1000000 of 1000000 found)
readseq : 0.174 micros/op; 636.5 MB/s
readreverse : 0.343 micros/op; 322.8 MB/s
compact : 301117.000 micros/op;
readrandom : 2.107 micros/op; (1000000 of 1000000 found)
readseq : 0.154 micros/op; 716.1 MB/s
readreverse : 0.294 micros/op; 376.6 MB/s
fill100K : 287.829 micros/op; 331.4 MB/s (1000 ops)
crc32c : 3.363 micros/op; 1161.6 MB/s (4K per op)
snappycomp : 3108.000 micros/op; (snappy failure)
snappyuncomp : 2989.000 micros/op; (snappy failure)
acquireload : 0.307 micros/op; (each op is 1000 loads)
[hyunyoung.lee@localhost out-static]$
위의 결과는 Snappy compression 를 이용하지 않을 때고,
이 Snappy compression을 이용하기 위해서 관련 software를 설치하겠다.
sudo yum install snappy-devel
We trust you have received the usual lecture from the local System
Administrator. It usually boils down to these three things:
#1) Respect the privacy of others.
#2) Think before you type.
#3) With great power comes great responsibility.
[sudo] password for hyunyoung.lee:
Sorry, try again.
[sudo] password for hyunyoung.lee:
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
* base: mirror.fdcservers.net
* extras: mirror.scalabledns.com
* updates: mirrors.sonic.net
Resolving Dependencies
--> Running transaction check
---> Package snappy-devel.x86_64 0:1.1.0-3.el7 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
===================================================================================================================================================================================================================
Package Arch Version Repository Size
===================================================================================================================================================================================================================
Installing:
snappy-devel x86_64 1.1.0-3.el7 base 14 k
Transaction Summary
===================================================================================================================================================================================================================
Install 1 Package
Total download size: 14 k
Installed size: 29 k
Is this ok [y/d/N]: y
Downloading packages:
snappy-devel-1.1.0-3.el7.x86_64.rpm | 14 kB 00:00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : snappy-devel-1.1.0-3.el7.x86_64 1/1
Verifying : snappy-devel-1.1.0-3.el7.x86_64 1/1
Installed:
snappy-devel.x86_64 0:1.1.0-3.el7
Complete!
[hyunyoung.lee@localhost out-static]$ ./db_bench
LevelDB: version 1.18
Date: Fri Apr 1 14:20:00 2016
CPU: 4 * Intel(R) Core(TM) i3-4130 CPU @ 3.40GHz
CPUCache: 3072 KB
Keys: 16 bytes each
Values: 100 bytes each (50 bytes after compression)
Entries: 1000000
RawSize: 110.6 MB (estimated)
FileSize: 62.9 MB (estimated)
WARNING: Snappy compression is not enabled
------------------------------------------------
fillseq : 1.429 micros/op; 77.4 MB/s
fillsync : 5.663 micros/op; 19.5 MB/s (1000 ops)
fillrandom : 2.099 micros/op; 52.7 MB/s
overwrite : 2.570 micros/op; 43.1 MB/s
readrandom : 3.822 micros/op; (1000000 of 1000000 found)
readrandom : 3.371 micros/op; (1000000 of 1000000 found)
readseq : 0.191 micros/op; 579.7 MB/s
readreverse : 0.370 micros/op; 298.7 MB/s
compact : 299751.000 micros/op;
readrandom : 2.289 micros/op; (1000000 of 1000000 found)
readseq : 0.162 micros/op; 682.0 MB/s
readreverse : 0.313 micros/op; 353.8 MB/s
fill100K : 307.209 micros/op; 310.5 MB/s (1000 ops)
crc32c : 3.384 micros/op; 1154.3 MB/s (4K per op)
snappycomp : 3065.000 micros/op; (snappy failure)
snappyuncomp : 2971.000 micros/op; (snappy failure)
acquireload : 0.318 micros/op; (each op is 1000 loads)
[hyunyoung.lee@localhost out-static]$ ^C
[hyunyoung.lee@localhost out-static]$ ^C
[hyunyoung.lee@localhost out-static]$
[hyunyoung.lee@localhost ~]$ g++ --version
g++ (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4)
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
snappy-devel.x86_64 0:1.1.0-3.el7
Transaction Summary
================================================================================
Install 1 Package
Total download size: 14 k
Installed size: 29 k
Is this ok [y/d/N]: y
Downloading packages:
snappy-devel-1.1.0-3.el7.x86_64.rpm | 14 kB 00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : snappy-devel-1.1.0-3.el7.x86_64 1/1
Verifying : snappy-devel-1.1.0-3.el7.x86_64 1/1
Installed:
snappy-devel.x86_64 0:1.1.0-3.el7
Complete!
[hyunyoung.lee@localhost out-static]$ ./db_bench
LevelDB: version 1.18
Date: Fri Apr 1 14:31:26 2016
CPU: 4 * Intel(R) Core(TM) i3-4130 CPU @ 3.40GHz
CPUCache: 3072 KB
Keys: 16 bytes each
Values: 100 bytes each (50 bytes after compression)
Entries: 1000000
RawSize: 110.6 MB (estimated)
FileSize: 62.9 MB (estimated)
------------------------------------------------
fillseq : 1.340 micros/op; 82.6 MB/s
fillsync : 5.820 micros/op; 19.0 MB/s (1000 ops)
fillrandom : 2.425 micros/op; 45.6 MB/s
overwrite : 3.043 micros/op; 36.4 MB/s
readrandom : 6.149 micros/op; (1000000 of 1000000 found)
readrandom : 5.756 micros/op; (1000000 of 1000000 found)
readseq : 0.264 micros/op; 419.7 MB/s
readreverse : 0.403 micros/op; 274.6 MB/s
compact : 377446.000 micros/op;
readrandom : 4.418 micros/op; (1000000 of 1000000 found)
readseq : 0.215 micros/op; 513.4 MB/s
readreverse : 0.359 micros/op; 307.9 MB/s
fill100K : 520.161 micros/op; 183.4 MB/s (1000 ops)
crc32c : 3.380 micros/op; 1155.6 MB/s (4K per op)
snappycomp : 4.435 micros/op; 880.7 MB/s (output: 55.1%)
snappyuncomp : 0.725 micros/op; 5389.8 MB/s
acquireload : 0.326 micros/op; (each op is 1000 loads)
자세히 보면 Snappy compression을 설치 후 WARNING: Snappy compression is not enabled 표시가 사라진 것을
확인을 할 수 있을 것이다.
==========================================================
google github leveldb를 보면 간단한 benchmark에 대한
정보를 얻을 수 있을 것이다.
구글 benchmarking code 간단한 요약
the following is at FLAGSS_bench structure.
// Comma-separated list of operations to run in the specified order
// Actual benchmarks:
// fillseq -- write N values in sequential key order in async mode
// fillrandom -- write N values in random key order in async mode
// overwrite -- overwrite N values in random key order in async mode
// fillsync -- write N/100 values in random key order in sync mode
// fill100K -- write N/1000 100K values in random order in async mode
// deleteseq -- delete N keys in sequential order
// deleterandom -- delete N keys in random order
// readseq -- read N times sequentially
// readreverse -- read N times in reverse order
// readrandom -- read N times in random order
// readmissing -- read N missing keys in random order
// readhot -- read N times in random order from 1% section of DB
// seekrandom -- N random seeks
// open -- cost of opening a DB
// crc32c -- repeated crc32c of 4K of data
// acquireload -- load N*1000 times
// Meta operations:
// compact -- Compact the entire DB
// stats -- Print DB stats
// sstables -- Print sstable info
// heapprofile -- Dump a heap profile (if supported by this port)
And static variable below is related to setting db_benching.
// Number of key/values to place in database
static int FLAGS_num = 1000000;
// Number of read operations to do. If negative, do FLAGS_num reads.
static int FLAGS_reads = -1;
// Number of concurrent threads to run.
static int FLAGS_threads = 1;
// Size of each value
static int FLAGS_value_size = 100;
// Arrange to generate values that shrink to this fraction of
// their original size after compression
static double FLAGS_compression_ratio = 0.5;
// Print histogram of operation timings
static bool FLAGS_histogram = false;
// Number of bytes to buffer in memtable before compacting
// (initialized to default value by "main")
static int FLAGS_write_buffer_size = 0;
// Number of bytes to use as a cache of uncompressed data.
// Negative means use default settings.
static int FLAGS_cache_size = -1;
// Maximum number of files to keep open at the same time (use default if == 0)
static int FLAGS_open_files = 0;
// Bloom filter bits per key.
// Negative means use default settings.
static int FLAGS_bloom_bits = -1;
// If true, do not destroy the existing database. If you set this
// flag and also specify a benchmark that wants a fresh database, that
// benchmark will fail.
static bool FLAGS_use_existing_db = false;
// If true, reuse existing log/MANIFEST files when re-opening a database.
static bool FLAGS_reuse_logs = false;
// Use the db with the following name.
static const char* FLAGS_db = NULL;
the next target is namesapce of levedb to cosist of class Randomgenerator.
namespace leveldb {
namespace {
// Helper for quickly generating random data.
class RandomGenerator {
private:
std::string data_;
int pos_;
public:
RandomGenerator() {
// We use a limited amount of data over and over again and ensure
// that it is larger than the compression window (32KB), and also
// large enough to serve all typical value sizes we want to write.
Random rnd(301);
std::string piece;
while (data_.size() < 1048576) {
// Add a short fragment that is as compressible as specified
// by FLAGS_compression_ratio.
test::CompressibleString(&rnd, FLAGS_compression_ratio, 100, &piece);
data_.append(piece);
}
pos_ = 0;
}
Slice Generate(size_t len) {
if (pos_ + len > data_.size()) {
pos_ = 0;
assert(len < data_.size());
}
pos_ += len;
return Slice(data_.data() + pos_ - len, len);
}
};
코드를 분석하면서 보니 block device에 직접 관리를 해서 하는 것이 아닌 단지 filesystem만 활용해서 데이터베이스를 만들고
갱신을 하는 시스템인거 같다.
그리고 RocksDB는 block device에 직접 데이터를 쓰면서 할수 있고 그리고 파일시스템에 데이터 베이스를 만들 수 있는 거
같다. 이를 분석을 해봐야 겠다.
What I assumed is wrong, After I checked the rocksDB, I realize RocksDB cannot also write data to device directly