The Current State of Apache Cassandra

DATASTAX JAPAN

June 29, 2023

Transcript

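  1. Apache Cassandra History
 Release timeline (as labeled on the slide):
 • 3.0 development branch
 • 3.0.0 release: November 10, 2015
 • 3.11.0 release: June 24, 2017
 • 4.0.0 release: July 30, 2021
 • 3.0.29 release: May 15, 2022
 • 3.11.15 release: May 5, 2023
 • 4.0.5 release: April 14, 2023
 • 4.1.0 release: May 30, 2022
 • 5.0.0 release: December ??, 2023
 Version labels on the branch lines: 3.0.29, 3.11.15, 4.0.9, 4.1.2
 Support policy:
 • 3.0.x, 3.11.x: supported until the 5.0.0 release
 • 4.0.x: supported until the 5.1.0 release
 • 4.1.x: supported until the 5.2.0 release
 Apache Cassandra follows Semantic Versioning 2.0.0 for its releases.
 Note: there is no direct upgrade path from 3.x to 5.0.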
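
 The support policy and the upgrade note on this slide can be sketched as a small lookup. This is a minimal illustration, not a Cassandra tool: the function names are hypothetical, and the table only encodes what the slide states.

```python
# Support table copied from the slide: (major, minor) -> release that ends support.
SUPPORTED_UNTIL = {
    (3, 0): "5.0.0",
    (3, 11): "5.0.0",
    (4, 0): "5.1.0",
    (4, 1): "5.2.0",
}


def parse_version(text):
    """Split a MAJOR.MINOR.PATCH string into a tuple of three ints."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def supported_until(version):
    """Return the release that ends support for this line, or None if unlisted."""
    major, minor, _ = parse_version(version)
    return SUPPORTED_UNTIL.get((major, minor))


def blocked_from_direct_upgrade_to_5_0(version):
    """The slide only states that 3.x cannot upgrade directly to 5.0."""
    major, _, _ = parse_version(version)
    return major == 3


print(supported_until("4.0.9"))
print(blocked_from_direct_upgrade_to_5_0("3.11.15"))
```

 Keying the table on (major, minor) reflects that the policy is stated per release line (3.0.x, 3.11.x, 4.0.x, 4.1.x), not per patch release.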

  2. Cassandra Enhancement Proposal
 Cassandra Enhancement Proposals (CEPs) provide a process for proposing, discussing, and approving new feature development in Cassandra.
 https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=95652201

 Adopted CEPs
 • CEP-3: Guardrails
 • CEP-7: Storage Attached Index
 • CEP-9: Make SSLContext creation pluggable
 • CEP-10: Cluster and Code Simulation
 • CEP-11: Pluggable memtable implementations
 • CEP-14: Paxos Improvements
 • CEP-15: General Purpose Transactions
 • CEP-13: Denylisting partitions
 • CEP-16: Auth Plugin Support for CQLSH
 • CEP-17: SSTable format API
 • CEP-19: Trie memtable implementations
 • CEP-20: Dynamic Data Masking
 • CEP-21: Transactional Cluster Metadata
 • CEP-25: Trie Indexed SSTable
 • CEP-26: Unified Compaction Strategy

 CEPs under discussion
 • CEP-1: Apache Cassandra Management Process(es)
 • CEP-2: Kubernetes Operator
 • CEP-12: Diagnostic Events in virtual tables
 • CEP-24: Password validation and generation
 • CEP-28: Reading and Writing Cassandra Data with Spark Bulk Analytics

 CEPs in draft
 • CEP-4: EXPLAIN
 • CEP-5: JOINs (copy of gdoc w/permissions changed)
 • CEP-6: Change Data Capture v2
 • CEP-8: Drivers Donation
 • CEP-27: Generic API for Internal Data Collections Exposure

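  3. Guardrails Framework (CEP-3)
 A feature that restricts, in advance, configurations that are likely to cause operational problems.
 • Disabling specific features
   • e.g., disallowing ALLOW FILTERING or secondary indexes
 • Restricting the use of specific values
   • e.g., limiting the number of elements in collection types or in IN clauses
 • Setting soft and hard limits on database sizes
   • e.g., limiting data disk usage or the number of table elements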
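
 The two kinds of guardrail described above (switching a feature off, and soft/hard limits on a value) can be sketched as follows. This is only an illustration of the idea under stated assumptions: the class names, guardrail names, and threshold numbers are hypothetical and are not Cassandra's implementation or its configuration keys.

```python
from dataclasses import dataclass


@dataclass
class ThresholdGuardrail:
    """Soft/hard limit: warn above warn_threshold, reject above fail_threshold."""
    name: str
    warn_threshold: int  # soft limit (hypothetical value)
    fail_threshold: int  # hard limit (hypothetical value)

    def check(self, value):
        if value > self.fail_threshold:
            return "fail"
        if value > self.warn_threshold:
            return "warn"
        return "ok"


@dataclass
class FeatureGuardrail:
    """On/off switch for a feature such as ALLOW FILTERING."""
    name: str
    enabled: bool

    def check(self):
        return "ok" if self.enabled else "fail"


# Illustrative values only.
in_clause_items = ThresholdGuardrail("in_clause_items", warn_threshold=20, fail_threshold=100)
allow_filtering = FeatureGuardrail("allow_filtering", enabled=False)

print(in_clause_items.check(50))
print(allow_filtering.check())
```

 Separating the warn and fail thresholds mirrors the soft and hard limits on the slide: the soft limit lets the operation through while flagging it, and the hard limit rejects it.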

  4. Apache Cassandra Upgrade Points
 ▪ SSTable versions
 Version 0
 • b (0.7.0): added version to sstable filenames
 • c (0.7.0): bloom filter component computes hashes over raw key bytes instead of strings
 • d (0.7.0): row size in data component becomes a long instead of int
 • e (0.7.0): stores undecorated keys in data and index components
 • f (0.7.0): switched bloom filter implementations in data component
 • g (0.8): tracks flushed-at context in metadata component
 Version 1
 • h (1.0): tracks max client timestamp in metadata component
 • hb (1.0.3): records compression ratio in metadata component
 • hc (1.0.4): records partitioner in metadata component
 • hd (1.0.10): includes row tombstones in maxtimestamp
 • he (1.1.3): includes ancestors generation in metadata component
 • hf (1.1.6): marker that replay position corresponds to 1.1.5+ millis-based id (see CASSANDRA-4782)
 • ia (1.2.0):
   • column indexes are promoted to the index file
   • records estimated histogram of deletion times in tombstones
   • bloom filter (keys and columns) upgraded to Murmur3
 • ib (1.2.1): tracks min client timestamp in metadata component
 • ic (1.2.5): omits per-row bloom filter of column names

  5. Apache Cassandra Upgrade Points
 ▪ SSTable versions
 Version 2
 • ja (2.0.0):
   • super columns are serialized as composites (note that there is no real format change, this is mostly a marker to know if we should expect super columns or not. We do need a major version bump however, because we should not allow streaming of super columns into this new format)
   • tracks max local deletiontime in sstable metadata
   • records bloom_filter_fp_chance in metadata component
   • remove data size and column count from data file (CASSANDRA-4180)
   • tracks max/min column values (according to comparator)
 • jb (2.0.1):
   • switch from crc32 to adler32 for compression checksums
   • checksum the compressed data
 • ka (2.1.0):
   • new Statistics.db file format
   • index summaries can be downsampled and the sampling level is persisted
   • switch uncompressed checksums to adler32
   • tracks presence of legacy (local and remote) counter shards
 • la (2.2.0): new file name format
 • lb (2.2.7): commit log lower bound included
 Version 3
 • ma (3.0.0):
   • swap bf hash order
   • store rows natively
 • mb (3.0.7, 3.7): commit log lower bound included
 • mc (3.0.8, 3.9): commit log intervals included
 • md (3.0.18, 3.11.4): corrected sstable min/max clustering
 • me (3.0.25, 3.11.11): added hostId of the node from which the sstable originated
 Version 4
 • na (4.0-rc1): uncompressed chunks, pending repair session, isTransient, checksummed sstable metadata file, new Bloomfilter format
 • nb (4.0.0): originating host id

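
 When planning an upgrade, it can help to look up which release introduced a given SSTable format version. The sketch below copies part of the list above into a table; the function names are hypothetical, and comparing version codes as plain strings is an assumption made for illustration that happens to match the alphabetical order of the codes listed here.

```python
# SSTable format version -> release(s) that introduced it, copied from the slides.
INTRODUCED_IN = {
    "ja": "2.0.0", "jb": "2.0.1", "ka": "2.1.0", "la": "2.2.0", "lb": "2.2.7",
    "ma": "3.0.0", "mb": "3.0.7, 3.7", "mc": "3.0.8, 3.9",
    "md": "3.0.18, 3.11.4", "me": "3.0.25, 3.11.11",
    "na": "4.0-rc1", "nb": "4.0.0",
}


def slide_group(version):
    """Return the 'Version N' group the slides file this format code under."""
    first = version[0]
    if first in "bcdefg":
        return 0
    if first in "hi":
        return 1
    if first in "jkl":
        return 2
    if first == "m":
        return 3
    if first == "n":
        return 4
    raise ValueError(f"format version not listed on the slides: {version}")


def is_newer_format(a, b):
    """Assumption: codes order alphabetically, so 'na' is newer than 'me'."""
    return a > b


print(INTRODUCED_IN["nb"], slide_group("nb"))
print(is_newer_format("na", "me"))
```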
  6. Apache Cassandra Upgrade Points
 ▪ Network protocol
 Protocol versions used for inter-node communication:
 VERSION_07 = 1;
 VERSION_080 = 2;
 VERSION_10 = 3;
 VERSION_11 = 4;
 VERSION_117 = 5;
 VERSION_12 = 6;
 VERSION_20 = 7;
 VERSION_21 = 8;
 VERSION_22 = 9;
 VERSION_30 = 10;
 VERSION_3014 = 11;
 VERSION_40 =
 Nodes running different versions are not necessarily unable to communicate.
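
 The remark that differing versions do not always break connectivity can be illustrated with a small sketch. It assumes, purely for illustration, that two peers settle on the lower of their two protocol versions; that rule and the function names are hypothetical, not Cassandra's actual handshake. VERSION_40 is left out because its value is cut off in the transcript.

```python
# Inter-node protocol versions as listed on the slide (VERSION_40 omitted).
MESSAGING_VERSIONS = {
    "VERSION_07": 1, "VERSION_080": 2, "VERSION_10": 3, "VERSION_11": 4,
    "VERSION_117": 5, "VERSION_12": 6, "VERSION_20": 7, "VERSION_21": 8,
    "VERSION_22": 9, "VERSION_30": 10, "VERSION_3014": 11,
}


def common_version(local, remote):
    """Assumed rule: peers communicate at the lower of the two versions."""
    return min(local, remote)


def name_of(number):
    """Reverse lookup from a version number to its constant name, or None."""
    for name, value in MESSAGING_VERSIONS.items():
        if value == number:
            return name
    return None


agreed = common_version(MESSAGING_VERSIONS["VERSION_30"],
                        MESSAGING_VERSIONS["VERSION_3014"])
print(agreed, name_of(agreed))
```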