Merged
20 changes: 10 additions & 10 deletions docs/compactionStrategy.md
@@ -70,7 +70,7 @@ Accumulo. Place your compaction configuration into a file and run the tool. For

The following command would check the configuration for errors:

- $ accumulo check-compaction-config /path/to/myconfig
+ $ accumulo conf check-compaction-config /path/to/myconfig


With the compaction configuration set, configure table specific properties.
@@ -97,18 +97,18 @@ If needed, `chop` compactions can be configured also.
Generate some data and files in order to test the strategy:

$ ./bin/runex client.SequentialBatchWriter -t examples.test1 --start 0 --num 1000 --size 50
- $ accumulo shell -u <username> -p <password> -e "flush -t examples.test1"
+ $ accumulo shell --user root --password secret -e "flush -t examples.test1"

$ ./bin/runex client.SequentialBatchWriter -t examples.test1 --start 0 --num 2000 --size 50
- $ accumulo shell -u <username> -p <password> -e "flush -t examples.test1"
+ $ accumulo shell --user root --password secret -e "flush -t examples.test1"

- $ accumulo shell -u <username> -p <password> -e "compact -t examples.test1 -w"
+ $ accumulo shell --user root --password secret -e "compact -t examples.test1 -w"

View the `tserver` log in <accumulo_home>/logs for the compaction and find the name of the `rfile` that was
compacted for your table. Print info about this file using the `rfile-info` tool. Replace the TableID with
the TableID from above. Note, your filenames will differ from the ones within this example.

- accumulo rfile-info hdfs:///accumulo/tables/2/default_tablet/A000000a.rf
+ $ accumulo file rfile-info hdfs:///accumulo/tables/2/default_tablet/A000000a.rf

Details about the rfile will be printed. The compression type should match the type used in the compaction.
In this case, `snappy` is used since the size is less than 100M.
@@ -123,20 +123,20 @@ Meta block : RFile.index
Continue to add additional data.

$ ./bin/runex client.SequentialBatchWriter -t examples.test1 --start 0 --num 1000000 --size 50
- $ accumulo shell -u <username> -p <password> -e "flush -t examples.test1"
+ $ accumulo shell --user root --password secret -e "flush -t examples.test1"

$ ./bin/runex client.SequentialBatchWriter -t examples.test1 --start 1000000 --num 1000000 --size 50
- $ accumulo shell -u <username> -p <password> -e "flush -t examples.test1"
+ $ accumulo shell --user root --password secret -e "flush -t examples.test1"

$ ./bin/runex client.SequentialBatchWriter -t examples.test1 --start 2000000 --num 1000000 --size 50
- $ accumulo shell -u <username> -p <password> -e "flush -t examples.test1"
+ $ accumulo shell --user root --password secret -e "flush -t examples.test1"

- $ accumulo shell -u <username> -p <password> -e "compact -t examples.test1 -w"
+ $ accumulo shell --user root --password secret -e "compact -t examples.test1 -w"

Again, view the tserver log in <accumulo_home>/logs for the compaction and find the name of the `rfile` that was
compacted for your table. Print info about this file using the `rfile-info` tool:

- accumulo rfile-info hdfs:///accumulo/tables/2/default_tablet/A000000o.rf
+ $ accumulo file rfile-info hdfs:///accumulo/tables/2/default_tablet/A000000o.rf

In this case, the compression type should be `gz`.
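
The two checks in this walkthrough (`snappy` for the small compaction, `gz` for the large one) follow a simple size rule. The sketch below captures that rule so the expected type can be compared against what `rfile-info` reports. `expected_compression` and `THRESHOLD_BYTES` are hypothetical names, not part of Accumulo. The sketch assumes that `100M` means 100 MiB and that a size exactly at the threshold still yields `snappy`; neither detail is confirmed by the text above.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of Accumulo: predicts the compression type
# this example expects for a compaction of a given size.

# Assumption: "100M" in the example configuration means 100 MiB.
THRESHOLD_BYTES=$((100 * 1024 * 1024))

expected_compression() {
  local size_bytes="$1"
  # Assumption: only sizes strictly above the threshold switch to gz.
  if [ "$size_bytes" -gt "$THRESHOLD_BYTES" ]; then
    echo "gz"
  else
    echo "snappy"
  fi
}
```

For example, `expected_compression 150000` prints `snappy`, while `expected_compression 314572800` (300 MiB) prints `gz`.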
