Filter
• Removes Tokens according to a fixed rule (e.g. StopFilter)
• Rewrites the text of a Token according to a fixed rule (e.g. LowerCaseFilter)

Example of an Analyzer: StandardAnalyzer
• StandardAnalyzer = StandardTokenizer + StopFilter + LowerCaseFilter
• StandardTokenizer is a Tokenizer based on Unicode Text Segmentation

"Lucene in Action"
  → StandardTokenizer → "Lucene", "in", "Action"
  → StopFilter → "Lucene", "Action"   (if the StopFilter has "in" as a stop word)
  → LowerCaseFilter → "lucene", "action"
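The Tokenizer → Filter → Filter chain above can be mimicked in a few lines of plain Java. This is only a toy model under stated assumptions: it does not use Lucene's classes, the class and method names (AnalysisChainSketch, tokenize, removeStopWords, lowerCase) are hypothetical, and whitespace splitting stands in for the Unicode Text Segmentation rules that StandardTokenizer follows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy model of an analysis chain. NOT Lucene's API; all names are hypothetical.
final class AnalysisChainSketch {

    // Stand-in for a Tokenizer: splits on whitespace only (an assumption;
    // StandardTokenizer uses Unicode Text Segmentation rules instead).
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Stand-in for StopFilter: drops every token found in the stop word set.
    static List<String> removeStopWords(List<String> tokens, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String t : tokens) {
            if (!stopWords.contains(t)) {
                kept.add(t);
            }
        }
        return kept;
    }

    // Stand-in for LowerCaseFilter: rewrites each token's text to lower case.
    static List<String> lowerCase(List<String> tokens) {
        List<String> lowered = new ArrayList<>();
        for (String t : tokens) {
            lowered.add(t.toLowerCase(Locale.ROOT));
        }
        return lowered;
    }

    // Chains the three steps in the order shown on the slide.
    static List<String> analyze(String text, Set<String> stopWords) {
        return lowerCase(removeStopWords(tokenize(text), stopWords));
    }

    public static void main(String[] args) {
        // Prints [lucene, action]
        System.out.println(analyze("Lucene in Action", Set.of("in")));
    }
}
```

Each step takes the token list produced by the previous one, which is the point of the chain: a filter never sees the raw text, only tokens.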
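import java.nio.file.Files;
import java.nio.file.Path;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

// The statements below belong inside a method that declares "throws IOException".
// Files.createDirectory fails if "index" already exists, so this starts from an empty index.
Path indexDirPath = Files.createDirectory(Path.of("index"));
Directory directory = FSDirectory.open(indexDirPath);

// Set up IndexWriter
Analyzer analyzer = new StandardAnalyzer();
IndexWriterConfig config = new IndexWriterConfig(analyzer);
IndexWriter indexWriter = new IndexWriter(directory, config);

// Index a document: "Lucene in Action"
Document doc1 = new Document();
doc1.add(new Field("title", "Lucene in Action", TextField.TYPE_STORED));
indexWriter.addDocument(doc1);

// Index a document: "Lucene Cookbook"
Document doc2 = new Document();
doc2.add(new Field("title", "Lucene Cookbook", TextField.TYPE_STORED));
indexWriter.addDocument(doc2);

// Write index to the directory (close() commits pending changes)
indexWriter.close();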
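_0_Lucene84_0.tip
_0_Lucene84_0.tmd
segments_1
write.lock

[Figure: Index after 1st commit. The directory listing is labeled Segment (the _0* files), Segments File (segments_1) and Lock File (write.lock); a second diagram shows the segments_N file pointing to its Segments.]

• An index consists of multiple segments
  • All of them are stored in the same directory
• A segment is a sub-index
  • On its own it functions almost as a complete Lucene index
• A segment consists of multiple files
  • File names have the form _gen.ext or _gen_Lucene84_0.ext
  • E.g. _0.fnm, _0_Lucene84_0.pos, …
  • gen: the segment's generation (e.g. 0)
  • ext: the per-format extension (e.g. fnm, pos)
• One segment is created each time IndexWriter flushes
  • It is referenced from segments_N only once it has been committed
  • N = 1, 2, …

Index Segments
Lucene index files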
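The naming scheme above (_gen.ext or _gen_Lucene84_0.ext) can be illustrated with a small parser. This is a minimal sketch that assumes only the two forms listed on the slide; the class SegmentFileNameSketch and its fields are hypothetical helpers written for this example, not part of Lucene.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical helper: splits a segment file name into the parts named on the slide.
final class SegmentFileNameSketch {
    final String segment; // "_" + gen, e.g. "_0"
    final String suffix;  // format-specific suffix, e.g. "Lucene84_0" (empty if absent)
    final String ext;     // per-format extension, e.g. "fnm" or "pos"

    private SegmentFileNameSketch(String segment, String suffix, String ext) {
        this.segment = segment;
        this.suffix = suffix;
        this.ext = ext;
    }

    // Returns empty for files that are not per-segment files,
    // such as segments_1 or write.lock.
    static Optional<SegmentFileNameSketch> parse(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (!fileName.startsWith("_") || dot < 0) {
            return Optional.empty();
        }
        String stem = fileName.substring(0, dot);
        String ext = fileName.substring(dot + 1);
        // The first underscore after the leading one separates gen from the suffix.
        int sep = stem.indexOf('_', 1);
        String segment = sep < 0 ? stem : stem.substring(0, sep);
        String suffix = sep < 0 ? "" : stem.substring(sep + 1);
        return Optional.of(new SegmentFileNameSketch(segment, suffix, ext));
    }

    public static void main(String[] args) {
        List<String> names =
            List.of("_0.fnm", "_0_Lucene84_0.pos", "_0_Lucene84_0.tip", "segments_1", "write.lock");
        for (String name : names) {
            Optional<SegmentFileNameSketch> parsed = parse(name);
            if (parsed.isPresent()) {
                SegmentFileNameSketch p = parsed.get();
                System.out.println(name + " -> segment=" + p.segment
                    + " suffix=" + p.suffix + " ext=" + p.ext);
            } else {
                System.out.println(name + " -> not a per-segment file");
            }
        }
    }
}
```

Grouping the directory listing by the parsed segment value recovers which files make up each segment, while segments_N and write.lock fall outside any segment.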
… loading all of it would exhaust memory, no matter how much is available

http://opensearchlab.otago.ac.nz/paper_….pdf
Białecki, Andrzej, et al. "Apache Lucene 4." SIGIR 2012 Workshop on Open Source Information Retrieval. 2012.

How Lucene uses index files
Lucene Architecture
  • E.g. Lucene87Codec, Lucene90Codec
  • Extends the abstract class Codec
• Codec-related implementations are, as a rule, gathered under the org.apache.lucene.codecs package
  • Packages are split per version
  • lucene90, for example, holds the Codec newly defined in Lucene 9.0
  • Backward-compatible Codecs live in org.apache.lucene.backward_codecs

Codec API

public class Lucene87Codec extends Codec {
  …
  private final TermVectorsFormat vectorsFormat = new Lucene…TermVectorsFormat();
  private final FieldInfosFormat fieldInfosFormat = new Lucene…FieldInfosFormat();
  private final SegmentInfoFormat segmentInfosFormat = new Lucene…SegmentInfoFormat();
  private final LiveDocsFormat liveDocsFormat = new Lucene…LiveDocsFormat();
  private final CompoundFormat compoundFormat = new Lucene…CompoundFormat();
  private final PointsFormat pointsFormat = new Lucene…PointsFormat();

  @Override
  public TermVectorsFormat termVectorsFormat() { return vectorsFormat; }

  @Override
  public final FieldInfosFormat fieldInfosFormat() { return fieldInfosFormat; }

  @Override
  public SegmentInfoFormat segmentInfoFormat() { return segmentInfosFormat; }
  …
}

Excerpt from org.apache.lucene.backward_codecs.lucene87.Lucene87Codec
    for (String value : set) {
      writeString(value);
    }
  }

• DataOutput#writeSetOfStrings(Set<String> set)
  • First, the number of elements is written with writeVInt()
  • Then each value is written with writeString(), once per element
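To make the resulting byte layout concrete, the sketch below re-implements the same two-step encoding on top of a ByteArrayOutputStream. It is a standalone illustration, not Lucene's code: the class name SetOfStringsEncodingSketch is hypothetical, and it assumes that a VInt stores 7 bits per byte (low-order group first, high bit set when more bytes follow) and that writeString() emits the UTF-8 byte length as a VInt followed by the UTF-8 bytes.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Standalone sketch of the writeSetOfStrings layout; not Lucene's implementation.
final class SetOfStringsEncodingSketch {

    // Assumed VInt layout: 7 bits per byte, low-order group first,
    // high bit set on every byte except the last.
    static void writeVInt(ByteArrayOutputStream out, int i) {
        while ((i & ~0x7F) != 0) {
            out.write((i & 0x7F) | 0x80);
            i >>>= 7;
        }
        out.write(i);
    }

    // Assumed string layout: UTF-8 byte length as a VInt, then the UTF-8 bytes.
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeVInt(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // Element count first, then one writeString per element, as on the slide.
    static byte[] encode(Set<String> set) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVInt(out, set.size());
        for (String value : set) {
            writeString(out, value);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // LinkedHashSet keeps insertion order, so the output is deterministic.
        Set<String> set = new LinkedHashSet<>();
        set.add("a");
        set.add("bc");
        // Prints [2, 1, 97, 2, 98, 99]:
        // count=2, then len=1 'a', then len=2 'b' 'c'
        System.out.println(Arrays.toString(encode(set)));
    }
}
```

Because the count comes first, a reader knows exactly how many strings to read back, and each string's length prefix tells it where the next one starts.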