feat(badger): add support for Badger version 4. The default remains Badger version 2 to ensure backward compatibility. #12316

Open · wants to merge 66 commits into base: feat/faster-datastore-with-badger
Changes from 55 commits

Commits (66)
30fbab6
wip
snissn Jul 19, 2024
7a278cf
delete vim swaps
snissn Jul 19, 2024
bee642a
check in wip
snissn Jul 20, 2024
b077093
wip
snissn Jul 25, 2024
0e14b09
wip
snissn Jul 27, 2024
a4b7538
options
snissn Jul 27, 2024
6c4f012
clean
snissn Jul 27, 2024
1690d3d
merge fix
snissn Jul 27, 2024
4adc087
docsgen
snissn Jul 27, 2024
a3c1c2d
progress on options/logger
snissn Jul 29, 2024
da5f823
last usage of Send wrapped using ForEach
snissn Jul 29, 2024
e4ffc68
changelog
snissn Jul 29, 2024
b3ecb25
go mod tidy
snissn Jul 29, 2024
6aa56cc
default to 2
snissn Jul 29, 2024
94f3ce4
set default badger version to 2
snissn Jul 29, 2024
5aefc8e
prefix bugfix
snissn Jul 29, 2024
d680f6f
clean up lotus-bench for pr
snissn Jul 30, 2024
0daa826
make docsgen
snissn Jul 30, 2024
dafc880
lint fixes
snissn Jul 30, 2024
395be91
fix func signature
snissn Jul 30, 2024
91d5680
merge conflict resolve - confirm f6978f01725fc8f8ef72cdf83d15aa57b8e0…
snissn Jul 30, 2024
fd626f9
bugfix
snissn Jul 30, 2024
900d31f
unused lint fix
snissn Jul 30, 2024
d0143fb
lintfix
snissn Jul 30, 2024
1b1bbd0
debugging test
snissn Jul 30, 2024
55918db
wip - fixes tests
snissn Jul 31, 2024
a1d39a7
clean up creation of blockstore
snissn Jul 31, 2024
dd95024
connect badgerVersion config with creation of blockstores
snissn Jul 31, 2024
279f2ba
bugfixes
snissn Jul 31, 2024
9f82d83
bugfix for make buildall
snissn Jul 31, 2024
c01521f
autogen
snissn Jul 31, 2024
92c3f2d
bugfix import bench
snissn Jul 31, 2024
ab99ef0
get tests to work with v2
snissn Jul 31, 2024
feac779
add both versions to badger test
snissn Jul 31, 2024
3fd2707
fmt
snissn Jul 31, 2024
48f50a6
keep ctx in doCopy
snissn Jul 31, 2024
2cac6a6
gofmt
snissn Jul 31, 2024
df59d5d
gofmt
snissn Jul 31, 2024
2a65044
clamp unused
snissn Jul 31, 2024
50cabd0
use cfg.Chainstore with BadgerHotBlockstore
snissn Aug 12, 2024
cab56a8
bugfix test all badger versions
snissn Aug 12, 2024
6c92ba1
Merge branch 'master' into mikers/BadgerVersions
snissn Aug 12, 2024
c461160
remove dead code for DefaultOptions and restore comments and keep v2 …
snissn Aug 12, 2024
6346cc6
Merge branch 'mikers/BadgerVersions' of github.com:filecoin-project/l…
snissn Aug 12, 2024
52604e9
remove unused AllKeysChan
snissn Aug 12, 2024
4de5f71
remove unused DeleteBlock method
snissn Aug 12, 2024
d2de85e
Update node/config/types.go
snissn Aug 13, 2024
ee737c9
Update blockstore/badger/versions/badger.go
snissn Aug 13, 2024
6d5d679
item.Version is unused
snissn Aug 13, 2024
26610e7
Merge branch 'mikers/BadgerVersions' of github.com:filecoin-project/l…
snissn Aug 13, 2024
c973d48
Update blockstore/badger/versions/badger.go
snissn Aug 13, 2024
0dc3a85
Update blockstore/badger/versions/badger.go
snissn Aug 13, 2024
624779f
lint and add import
snissn Aug 13, 2024
c7e3a84
Merge branch 'mikers/BadgerVersions' of github.com:filecoin-project/l…
snissn Aug 13, 2024
7b22c1c
makegen
snissn Aug 13, 2024
59fe7e3
feat: f3: update go-f3 to 0.2.0 (#12390)
Kubuxu Aug 15, 2024
d415d9f
docs: update references to releases branch (#12396)
BigLep Aug 15, 2024
517c7ae
fix(ci): don't PR or changelog check for draft PRs (#12405)
rvagg Aug 19, 2024
0225c91
chore: post release steps for #12379 (v1.28.2 miner and node patch re…
rjan90 Aug 19, 2024
dbef5de
feat(libp2p): expose libp2p bandwidth metrics (#12402)
Stebalien Aug 19, 2024
a5cb674
fix: error check
qwdsds Aug 20, 2024
8518d23
build: update Lotus Node version to v1.29.1-dev in master (#12409)
rjan90 Aug 22, 2024
475139f
chore: deps: update to CGO-free go-crypto (#12411)
ribasushi Aug 23, 2024
4a4ddaa
docs: updates about branches and where to target PRs (#12416)
BigLep Aug 28, 2024
8f4299e
disable snappy compression in badger v4
snissn Aug 30, 2024
5ccbb3c
merge
snissn Aug 30, 2024
1 change: 1 addition & 0 deletions CHANGELOG.md
Member
I think this needs expansion. Maybe badger v4 should be its own section? Maybe something like:

Experimental Badger v4 Support

With filecoin-project/lotus#12316, users can now opt-in to using Badger v4 instead of v2 for the datastore.

Why upgrade?

  • Performance improvement - Initial benchmarks showed 40% faster time to import a snapshot. (I'm making stuff up here...) I/O and CPU utilization anecdotally reduced by 10% on the same workload (link to Grafana).
  • Active development - Badger v2 was released ~4 years ago, but v4 is where all of the community's active development to improve disk read/write times and memory efficiency happens.

How to upgrade?
The v2 and v4 datastores are incompatible. Since Badger datastores are plain directories, it's advised to first copy your v2 datastore aside. Then enable v4 with LOTUS_CHAINSTORE_BADGERVERSION=4 and download a recent mainnet snapshot (link to snapshot directory) to import using the lotus command.

If you run into any problems, please report them by opening an issue. You can also roll back by setting LOTUS_CHAINSTORE_BADGERVERSION=2 and copying the v2 directory back.
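
To make that opt-in concrete, here is a minimal sketch of how the version selection could be resolved at startup. The helper name and the direct environment lookup are illustrative assumptions; the PR itself wires the value through the node config (cfg.Chainstore), with LOTUS_CHAINSTORE_BADGERVERSION as the operator-facing switch.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// resolveBadgerVersion returns the Badger major version to use for the
// chainstore: 2 (the default, for backward compatibility) or 4 (opt-in).
// Hypothetical helper for illustration; not part of the PR.
func resolveBadgerVersion() (int, error) {
	raw := os.Getenv("LOTUS_CHAINSTORE_BADGERVERSION")
	if raw == "" {
		return 2, nil // default: keep the existing v2 datastore layout
	}
	v, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("invalid LOTUS_CHAINSTORE_BADGERVERSION %q: %w", raw, err)
	}
	switch v {
	case 2, 4:
		return v, nil
	default:
		return 0, fmt.Errorf("unsupported badger version %d (supported: 2 and 4)", v)
	}
}

func main() {
	v, err := resolveBadgerVersion()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("chainstore will use badger v%d\n", v)
}
```

Whatever the exact wiring, the properties match the description above: the default stays 2, and only 2 and 4 are accepted.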

@@ -26,6 +26,7 @@
## New features

- feat: Add trace filter API supporting RPC method `trace_filter` ([filecoin-project/lotus#12123](https://github.com/filecoin-project/lotus/pull/12123)). Configuring `EthTraceFilterMaxResults` sets a limit on how many results are returned in any individual `trace_filter` RPC API call.
- feat: Add support for Badger version 4. The default remains Badger version 2 to ensure backward compatibility. ([filecoin-project/lotus#12316](https://github.com/filecoin-project/lotus/pull/12316)). Setting `LOTUS_CHAINSTORE_BADGERVERSION` configures Lotus to use Badger version 2 or 4.
- feat: `FilecoinAddressToEthAddress` RPC can now return ETH addresses for all Filecoin address types ("f0"/"f1"/"f2"/"f3") based on client's re-org tolerance. This is a breaking change if you are using the API via the go-jsonrpc library or by using Lotus as a library, but is a non-breaking change when using the API via any other RPC method as it adds an optional second argument.
([filecoin-project/lotus#12324](https://github.com/filecoin-project/lotus/pull/12324)).
- feat: Added `lotus-shed indexes inspect-events` health-check command ([filecoin-project/lotus#12346](https://github.com/filecoin-project/lotus/pull/12346)).
169 changes: 40 additions & 129 deletions blockstore/badger/blockstore.go
@@ -10,20 +10,23 @@ import (
"sync"
"time"

"github.com/dgraph-io/badger/v2"
"github.com/dgraph-io/badger/v2/options"
badgerstruct "github.com/dgraph-io/badger/v2/pb"
blocks "github.com/ipfs/go-block-format"
"github.com/ipfs/go-cid"
ipld "github.com/ipfs/go-ipld-format"
logger "github.com/ipfs/go-log/v2"
pool "github.com/libp2p/go-buffer-pool"
"github.com/multiformats/go-base32"
"go.uber.org/multierr"
"go.uber.org/zap"
"golang.org/x/xerrors"

"github.com/filecoin-project/lotus/blockstore"
"github.com/filecoin-project/lotus/blockstore/badger/versions"
badger "github.com/filecoin-project/lotus/blockstore/badger/versions"
)

// aliases to mask badger dependencies.
const (
defaultGCThreshold = 0.125
)

var (
@@ -39,46 +42,6 @@
log = logger.Logger("badgerbs")
)

// aliases to mask badger dependencies.
const (
// FileIO is equivalent to badger/options.FileIO.
FileIO = options.FileIO
// MemoryMap is equivalent to badger/options.MemoryMap.
MemoryMap = options.MemoryMap
// LoadToRAM is equivalent to badger/options.LoadToRAM.
LoadToRAM = options.LoadToRAM
defaultGCThreshold = 0.125
)

// Options embeds the badger options themselves, and augments them with
// blockstore-specific options.
type Options struct {
badger.Options

// Prefix is an optional prefix to prepend to keys. Default: "".
Prefix string
}

func DefaultOptions(path string) Options {
return Options{
Options: badger.DefaultOptions(path),
Prefix: "",
}
}

// badgerLogger is a local wrapper for go-log to make the interface
// compatible with badger.Logger (namely, aliasing Warnf to Warningf)
type badgerLogger struct {
*zap.SugaredLogger // skips 1 caller to get useful line info, skipping over badger.Options.

skip2 *zap.SugaredLogger // skips 2 callers, just like above + this logger.
}

// Warningf is required by the badger logger APIs.
func (b *badgerLogger) Warningf(format string, args ...interface{}) {
b.skip2.Warnf(format, args...)
}

// bsState is the current blockstore state
type bsState int

@@ -115,9 +78,9 @@ type Blockstore struct {
moveState bsMoveState
rlock int

db *badger.DB
dbNext *badger.DB // when moving
opts Options
db badger.BadgerDB
dbNext badger.BadgerDB // when moving
opts badger.Options

prefixing bool
prefix []byte
@@ -132,13 +95,9 @@ var _ blockstore.BlockstoreSize = (*Blockstore)(nil)
var _ io.Closer = (*Blockstore)(nil)

// Open creates a new badger-backed blockstore, with the supplied options.
func Open(opts Options) (*Blockstore, error) {
opts.Logger = &badgerLogger{
SugaredLogger: log.Desugar().WithOptions(zap.AddCallerSkip(1)).Sugar(),
skip2: log.Desugar().WithOptions(zap.AddCallerSkip(2)).Sugar(),
}
func Open(opts badger.Options) (*Blockstore, error) {

db, err := badger.Open(opts.Options)
db, err := badger.OpenBadgerDB(opts)
if err != nil {
return nil, fmt.Errorf("failed to open badger blockstore: %w", err)
}
@@ -315,10 +274,10 @@ func (b *Blockstore) movingGC(ctx context.Context) error {
log.Infof("moving blockstore from %s to %s", b.opts.Dir, newPath)

opts := b.opts
opts.Dir = newPath
opts.ValueDir = newPath
opts.SetDir(newPath)
opts.SetValueDir(newPath)

dbNew, err := badger.Open(opts.Options)
dbNew, err := badger.OpenBadgerDB(opts)
if err != nil {
return fmt.Errorf("failed to open badger blockstore in %s: %w", newPath, err)
}
@@ -391,65 +350,8 @@ func symlink(path, linkTo string) error {
}

// doCopy copies a badger blockstore to another
func (b *Blockstore) doCopy(ctx context.Context, from, to *badger.DB) (defErr error) {
batch := to.NewWriteBatch()
defer func() {
if defErr == nil {
defErr = batch.Flush()
}
if defErr != nil {
batch.Cancel()
}
}()

return iterateBadger(ctx, from, func(kvs []*badgerstruct.KV) error {
// check whether context is closed on every kv group
if err := ctx.Err(); err != nil {
return err
}
for _, kv := range kvs {
if err := batch.Set(kv.Key, kv.Value); err != nil {
return err
}
}
return nil
})
}

var IterateLSMWorkers int // defaults to between( 2, 8, runtime.NumCPU/2 )

func iterateBadger(ctx context.Context, db *badger.DB, iter func([]*badgerstruct.KV) error) error {
workers := IterateLSMWorkers
if workers == 0 {
workers = between(2, 8, runtime.NumCPU()/2)
}

stream := db.NewStream()
stream.NumGo = workers
stream.LogPrefix = "iterateBadgerKVs"
stream.Send = func(kvl *badgerstruct.KVList) error {
kvs := make([]*badgerstruct.KV, 0, len(kvl.Kv))
for _, kv := range kvl.Kv {
if kv.Key != nil && kv.Value != nil {
kvs = append(kvs, kv)
}
}
if len(kvs) == 0 {
return nil
}
return iter(kvs)
}
return stream.Orchestrate(ctx)
}

func between(min, max, val int) int {
if val > max {
val = max
}
if val < min {
val = min
}
return val
func (b *Blockstore) doCopy(ctx context.Context, from versions.BadgerDB, to versions.BadgerDB) error {
return from.Copy(ctx, to)
}
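
For context on where the removed stream-copy logic plausibly lives after this refactor, a hedged sketch of a v2-flavoured Copy follows, reusing the WriteBatch and Stream orchestration deleted from this file above. The diff does not show the versions package internals, so the package name, function name, and signature below are assumptions.

```go
package badgercopy

import (
	"context"
	"runtime"

	badger "github.com/dgraph-io/badger/v2"
	pb "github.com/dgraph-io/badger/v2/pb"
)

// copyV2 streams every key/value pair from one badger v2 store into another.
// Sketch only; in the PR the real code sits behind versions.BadgerDB's Copy
// method (see doCopy above).
func copyV2(ctx context.Context, from, to *badger.DB) (defErr error) {
	batch := to.NewWriteBatch()
	defer func() {
		if defErr == nil {
			defErr = batch.Flush()
		}
		if defErr != nil {
			batch.Cancel()
		}
	}()

	// Clamp workers to between(2, 8, NumCPU/2), as the removed helper did.
	workers := runtime.NumCPU() / 2
	if workers < 2 {
		workers = 2
	}
	if workers > 8 {
		workers = 8
	}

	stream := from.NewStream()
	stream.NumGo = workers
	stream.LogPrefix = "copyBadgerKVs"
	stream.Send = func(kvl *pb.KVList) error {
		// Check for cancellation on every KV group, as the removed doCopy did.
		if err := ctx.Err(); err != nil {
			return err
		}
		for _, kv := range kvl.Kv {
			if kv.Key == nil || kv.Value == nil {
				continue
			}
			if err := batch.Set(kv.Key, kv.Value); err != nil {
				return err
			}
		}
		return nil
	}
	return stream.Orchestrate(ctx)
}
```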

func (b *Blockstore) deleteDB(path string) {
@@ -505,7 +407,7 @@ func (b *Blockstore) onlineGC(ctx context.Context, threshold float64, checkFreq
}
}

if err == badger.ErrNoRewrite {
if err == b.db.GetErrNoRewrite() {
// not really an error in this case, it signals the end of GC
return nil
}
@@ -578,7 +480,7 @@ func (b *Blockstore) GCOnce(ctx context.Context, opts ...blockstore.BlockstoreGC

// Note no compaction needed before single GC as we will hit at most one vlog anyway
err := b.db.RunValueLogGC(threshold)
if err == badger.ErrNoRewrite {
if err == b.db.GetErrNoRewrite() {
// not really an error in this case, it signals the end of GC
return nil
}
@@ -636,11 +538,14 @@ func (b *Blockstore) View(ctx context.Context, cid cid.Cid, fn func([]byte) erro
defer KeyPool.Put(k)
}

return b.db.View(func(txn *badger.Txn) error {
return b.db.View(func(txn badger.Txn) error {

errKeyNotFound := b.db.GetErrKeyNotFound()

switch item, err := txn.Get(k); err {
case nil:
return item.Value(fn)
case badger.ErrKeyNotFound:
case errKeyNotFound:
return ipld.ErrNotFound{Cid: cid}
default:
return fmt.Errorf("failed to view block from badger blockstore: %w", err)
@@ -683,13 +588,14 @@ func (b *Blockstore) Has(ctx context.Context, cid cid.Cid) (bool, error) {
defer KeyPool.Put(k)
}

err := b.db.View(func(txn *badger.Txn) error {
err := b.db.View(func(txn badger.Txn) error {
_, err := txn.Get(k)
return err
})

errKeyNotFound := b.db.GetErrKeyNotFound()
Member

This isn't great. Other options:

  1. Re-write errors inside these functions. We'd need to be careful to make sure we don't miss any cases...
  2. Horrible but... we can set badger2.ErrKeyNotFound = badger4.ErrKeyNotFound. Really, it's not that horrible (unless someone is stashing these errors early and using those stashed versions for comparison).
  3. Change the APIs so we don't need to check against special errors?
  4. Provide IsKeyNotFound etc. helpers that abstract over different badger versions? Also kind of nasty.

I can't think of a great solution, tbh.

Contributor Author

I looked into badger2.ErrKeyNotFound = badger4.ErrKeyNotFound and the current implementation feels like the most maintainable option to me
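
For comparison, a minimal sketch of option 4 from the thread above (an IsKeyNotFound helper). This is not what the PR ships — the diff keeps the GetErrKeyNotFound() accessor — and the package and function names here are assumptions.

```go
package badgererrs

import (
	"errors"

	badger2 "github.com/dgraph-io/badger/v2"
	badger4 "github.com/dgraph-io/badger/v4"
)

// IsKeyNotFound reports whether err is the "key not found" sentinel of
// either supported badger major version. Illustrative helper only.
func IsKeyNotFound(err error) bool {
	return errors.Is(err, badger2.ErrKeyNotFound) || errors.Is(err, badger4.ErrKeyNotFound)
}
```

Call sites would then read `if IsKeyNotFound(err)` instead of fetching the sentinel from the handle first, at the cost of this helper importing both badger versions.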

switch err {
case badger.ErrKeyNotFound:
case errKeyNotFound:
return false, nil
case nil:
return true, nil
@@ -718,12 +624,13 @@ func (b *Blockstore) Get(ctx context.Context, cid cid.Cid) (blocks.Block, error)
}

var val []byte
err := b.db.View(func(txn *badger.Txn) error {
err := b.db.View(func(txn badger.Txn) error {
errKeyNotFound := b.db.GetErrKeyNotFound()
switch item, err := txn.Get(k); err {
case nil:
val, err = item.ValueCopy(nil)
return err
case badger.ErrKeyNotFound:
case errKeyNotFound:
return ipld.ErrNotFound{Cid: cid}
default:
return fmt.Errorf("failed to get block from badger blockstore: %w", err)
@@ -751,11 +658,12 @@ func (b *Blockstore) GetSize(ctx context.Context, cid cid.Cid) (int, error) {
}

var size int
err := b.db.View(func(txn *badger.Txn) error {
err := b.db.View(func(txn badger.Txn) error {
errKeyNotFound := b.db.GetErrKeyNotFound()
switch item, err := txn.Get(k); err {
case nil:
size = int(item.ValueSize())
case badger.ErrKeyNotFound:
case errKeyNotFound:
return ipld.ErrNotFound{Cid: cid}
default:
return fmt.Errorf("failed to get block size from badger blockstore: %w", err)
@@ -805,10 +713,13 @@ func (b *Blockstore) PutMany(ctx context.Context, blocks []blocks.Block) error {
keys = append(keys, k)
}

err := b.db.View(func(txn *badger.Txn) error {
err := b.db.View(func(txn badger.Txn) error {

errKeyNotFound := b.db.GetErrKeyNotFound()

for i, k := range keys {
switch _, err := txn.Get(k); err {
case badger.ErrKeyNotFound:
case errKeyNotFound:
case nil:
keys[i] = nil
default:
@@ -822,7 +733,7 @@ func (b *Blockstore) PutMany(ctx context.Context, blocks []blocks.Block) error {
return err
}

put := func(db *badger.DB) error {
put := func(db badger.BadgerDB) error {
batch := db.NewWriteBatch()
defer batch.Cancel()

@@ -1070,6 +981,6 @@ func (b *Blockstore) StorageKey(dst []byte, cid cid.Cid) []byte {

// DB is added for lotus-shed needs
// WARNING: THIS IS COMPLETELY UNSAFE; DONT USE THIS IN PRODUCTION CODE
func (b *Blockstore) DB() *badger.DB {
func (b *Blockstore) DB() badger.BadgerDB {
return b.db
}