bitcask-mirror/README.md
2021-07-22 19:44:29 +10:00

241 lines
9.3 KiB
Markdown

# bitcask
[![Build Status](https://ci.mills.io/api/badges/prologic/bitcask/status.svg)](https://ci.mills.io/prologic/bitcask)
[![Go Report Card](https://goreportcard.com/badge/git.mills.io/prologic/bitcask)](https://goreportcard.com/report/git.mills.io/prologic/bitcask)
[![Go Reference](https://pkg.go.dev/badge/git.mills.io/prologic/bitcask.svg)](https://pkg.go.dev/git.mills.io/prologic/bitcask)
A high performance Key/Value store written in [Go](https://golang.org) with a predictable read/write performance and high throughput. Uses a [Bitcask](https://en.wikipedia.org/wiki/Bitcask) on-disk layout (LSM+WAL) similar to [Riak](https://riak.com/)
For a more feature-complete Redis-compatible server, distributed key/value store have a look at [Bitraft](https://git.mills.io/prologic/bitraft) which uses this library as its backend. Use [Bitcask](https://git.mills.io/prologic/bitcask) as a starting point or if you want to embed in your application, use [Bitraft](https://git.mills.io/prologic/bitraft) if you need a complete server/client solution with high availability with a Redis-compatible API.
## Features
* Embedded (`import "git.mills.io/prologic/bitcask"`)
* Builtin CLI (`bitcask`)
* Builtin Redis-compatible server (`bitcaskd`)
* Predictable read/write performance
* Low latency
* High throughput (See: [Performance](README.md#Performance) )
## Is Bitcask right for my project?
__NOTE__: Please read this carefully to identify whether using Bitcask is
suitable for your needs.
`bitcask` is a **great fit** for:
- Storing hundreds of thousands to millions of key/value pairs based on
default configuration. With the default configuration (_configurable_)
of 64 bytes per key and 64kB values, 1M keys would consume roughly ~600-700MB
of memory ~65-70GB of disk storage. These are all configurable when you
create a new database with `bitcask.Open(...)` with functional-style options
you can pass with `WithXXX()`.
- As the backing store to a distributed key/value store. See for example the
[bitraft](https://git.mills.io/prologic/bitraft) as an example of this.
- For high performance, low latency read/write workloads where you cannot fit
a typical hash-map into memory, but require the highest level of performance
and predicate read latency. Bitcask ensures only 1 read/write IOPS are ever
required for reading and writing key/value pairs.
- As a general purpose embedded key/value store where you would have used
[BoltDB](https://github.com/boltdb/bolt),
[LevelDB](https://github.com/syndtr/goleveldb),
[BuntDB](https://github.com/tidwall/buntdb)
or similar...
`bitcask` is not suited for:
- Storing billions of records
The reason for this is the key-space is held in memory using a highly
performant and memory optimized adaptive radix tree thanks to
[go-adaptive-radix-tree](github.com/plar/go-adaptive-radix-tree) _however_
this means the more keys you have in your key space, the more memory is
consumed. Consider using a disk-backed B-Tree like [BoltDB](https://github.com/boltdb/bolt)
or [LevelDB](https://github.com/syndtr/goleveldb) if you intend to store a
large quantity of key/value pairs.
> Note however that storing large amounts of data in terms of value(s) is
> totally fine. In other wise thousands to millions of keys with large values
> will work just fine.
- Write intensive workloads. Due to the [Bitcask design](https://riak.com/assets/bitcask-intro.pdf?source=post_page---------------------------)
heavy write workloads that lots of key/value pairs will over time cause
problems like "Too many open files" (#193) errors to occur. This can be mitigated by
periodically compacting the data files by issuing a `.Merge()` operation however
if key/value pairs do not change or are never deleted, as-in only new key/value
pairs are ever written this will have no effect. Eventually you will run out
of file descriptors!
> You should consider your read/write workloads carefully and ensure you set
> appropriate file descriptor limits with `ulimit -n` that suit your needs.
## Development
```sh
$ git clone https://git.mills.io/prologic/bitcask.git
$ make
```
## Install
```sh
$ go get git.mills.io/prologic/bitcask
```
## Usage (library)
Install the package into your project:
```sh
$ go get git.mills.io/prologic/bitcask
```
```go
package main
import (
"log"
"git.mills.io/prologic/bitcask"
)
func main() {
db, _ := bitcask.Open("/tmp/db")
defer db.Close()
db.Put([]byte("Hello"), []byte("World"))
val, _ := db.Get([]byte("Hello"))
log.Printf(string(val))
}
```
See the [GoDoc](https://godoc.org/git.mills.io/prologic/bitcask) for further
documentation and other examples.
## Usage (tool)
```sh
$ bitcask -p /tmp/db set Hello World
$ bitcask -p /tmp/db get Hello
World
```
## Usage (server)
There is also a builtin very simple Redis-compatible server called `bitcaskd`:
```sh
$ ./bitcaskd ./tmp
INFO[0000] starting bitcaskd v0.0.7@146f777 bind=":6379" path=./tmp
```
Example session:
```sh
$ telnet localhost 6379
Trying ::1...
Connected to localhost.
Escape character is '^]'.
SET foo bar
+OK
GET foo
$3
bar
DEL foo
:1
GET foo
$-1
PING
+PONG
QUIT
+OK
Connection closed by foreign host.
```
## Docker
You can also use the [Bitcask Docker Image](https://cloud.docker.com/u/prologic/repository/docker/prologic/bitcask):
```sh
$ docker pull prologic/bitcask
$ docker run -d -p 6379:6379 prologic/bitcask
```
## Performance
Benchmarks run on a 11" MacBook with a 1.4Ghz Intel Core i7:
```sh
$ make bench
...
goos: darwin
goarch: amd64
pkg: git.mills.io/prologic/bitcask
BenchmarkGet/128B-4 316515 3263 ns/op 39.22 MB/s 160 B/op 1 allocs/op
BenchmarkGet/256B-4 382551 3204 ns/op 79.90 MB/s 288 B/op 1 allocs/op
BenchmarkGet/512B-4 357216 3835 ns/op 133.51 MB/s 576 B/op 1 allocs/op
BenchmarkGet/1K-4 274958 4429 ns/op 231.20 MB/s 1152 B/op 1 allocs/op
BenchmarkGet/2K-4 227764 5013 ns/op 408.55 MB/s 2304 B/op 1 allocs/op
BenchmarkGet/4K-4 187557 5534 ns/op 740.15 MB/s 4864 B/op 1 allocs/op
BenchmarkGet/8K-4 153546 7652 ns/op 1070.56 MB/s 9472 B/op 1 allocs/op
BenchmarkGet/16K-4 115549 10272 ns/op 1594.95 MB/s 18432 B/op 1 allocs/op
BenchmarkGet/32K-4 69592 16405 ns/op 1997.39 MB/s 40960 B/op 1 allocs/op
BenchmarkPut/128BNoSync-4 123519 11094 ns/op 11.54 MB/s 49 B/op 2 allocs/op
BenchmarkPut/256BNoSync-4 84662 13398 ns/op 19.11 MB/s 50 B/op 2 allocs/op
BenchmarkPut/1KNoSync-4 46345 24855 ns/op 41.20 MB/s 58 B/op 2 allocs/op
BenchmarkPut/2KNoSync-4 28820 43817 ns/op 46.74 MB/s 68 B/op 2 allocs/op
BenchmarkPut/4KNoSync-4 13976 90059 ns/op 45.48 MB/s 89 B/op 2 allocs/op
BenchmarkPut/8KNoSync-4 7852 155101 ns/op 52.82 MB/s 130 B/op 2 allocs/op
BenchmarkPut/16KNoSync-4 4848 238113 ns/op 68.81 MB/s 226 B/op 2 allocs/op
BenchmarkPut/32KNoSync-4 2564 391483 ns/op 83.70 MB/s 377 B/op 3 allocs/op
BenchmarkPut/128BSync-4 260 4611273 ns/op 0.03 MB/s 48 B/op 2 allocs/op
BenchmarkPut/256BSync-4 265 4665506 ns/op 0.05 MB/s 48 B/op 2 allocs/op
BenchmarkPut/1KSync-4 256 4757334 ns/op 0.22 MB/s 48 B/op 2 allocs/op
BenchmarkPut/2KSync-4 255 4996788 ns/op 0.41 MB/s 92 B/op 2 allocs/op
BenchmarkPut/4KSync-4 222 5136481 ns/op 0.80 MB/s 98 B/op 2 allocs/op
BenchmarkPut/8KSync-4 223 5530824 ns/op 1.48 MB/s 99 B/op 2 allocs/op
BenchmarkPut/16KSync-4 213 5717880 ns/op 2.87 MB/s 202 B/op 2 allocs/op
BenchmarkPut/32KSync-4 211 5835948 ns/op 5.61 MB/s 355 B/op 3 allocs/op
BenchmarkScan-4 568696 2036 ns/op 392 B/op 33 allocs/op
PASS
```
For 128B values:
* ~300,000 reads/sec
* ~90,000 writes/sec
* ~490,000 scans/sec
The full benchmark above shows linear performance as you increase key/value sizes.
## Support
Support the ongoing development of Bitcask!
**Sponsor**
- Become a [Sponsor](https://www.patreon.com/prologic)
## Contributors
Thank you to all those that have contributed to this project, battle-tested it,
used it in their own projects or products, fixed bugs, improved performance
and even fix tiny typos in documentation! Thank you and keep contributing!
You can find an [AUTHORS](/AUTHORS) file where we keep a list of contributors
to the project. If you contribute a PR please consider adding your name there.
## Related Projects
- [bitraft](https://git.mills.io/prologic/bitraft) -- A Distributed Key/Value store (_using Raft_) with a Redis compatible protocol.
- [bitcaskfs](https://git.mills.io/prologic/bitcaskfs) -- A FUSE file system for mounting a Bitcask database.
- [bitcask-bench](https://git.mills.io/prologic/bitcask-bench) -- A benchmarking tool comparing Bitcask and several other Go key/value libraries.
## License
bitcask is licensed under the term of the [MIT License](https://git.mills.io/prologic/bitcask/blob/master/LICENSE)