Processing huge text files with "Unix" tools
Luiz Menezes
February 11, 2017
Talk given at the "Linux em Prosa" event organized by sancaLUG.
Transcript
Processing huge text files with "Unix" tools
Hi! I'm Luiz. Find me at @luiz_amf and github.com/lamenezes
The problem
▷ opening, reading, and processing huge texts: text files with more than 10 million lines (>4 GB)
“Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.”
"Unix Philosophy", as summarized by Peter Salus
Do one thing and do it well: the toolbox
cat: concatenate files and print
▷ view text files
▷ example: $ cat foo.txt
less
▷ view text files
▷ allows navigation
▷ reads the file as it runs, instead of loading it all up front
▷ example: $ less foo.txt
cp: copy
▷ always keep a backup of your data
▷ example: $ cp foo.txt backup.txt
head & tail
▷ print X lines of the file (head from the start, tail from the end)
▷ divide and conquer
▷ examples: $ head -n 20 foo.txt
            $ tail -n 50 foo.txt
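A minimal sketch of the "divide and conquer" idea from the head & tail slide. The 100-line file generated with seq is a hypothetical stand-in for a multi-gigabyte input, and the variable names are illustrative only.

```shell
# Hypothetical sample data: a 100-line file standing in for a much larger one.
workdir=$(mktemp -d)
seq 1 100 > "$workdir/foo.txt"

# First 3 lines and last 2 lines of the file.
first_lines=$(head -n 3 "$workdir/foo.txt")
last_lines=$(tail -n 2 "$workdir/foo.txt")

# Chaining both prints a slice from the middle: lines 41 to 43.
middle_lines=$(head -n 43 "$workdir/foo.txt" | tail -n 3)

printf '%s\n' "$first_lines" "$last_lines" "$middle_lines"
```

Putting the -n option before the file name keeps the command portable; some implementations reject options placed after the operand.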
wc: word count
▷ how big is the trouble?
▷ computes, for the file:
  ◦ lines
  ◦ characters/bytes
  ◦ words
▷ example: $ wc foo.txt
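As a rough illustration of the wc slide, the sketch below asks for each count separately. The three-line sample file is made up for the example; reading from standard input is a choice made here so that wc prints only the number.

```shell
# Hypothetical sample file with 3 lines, 6 words, and 28 bytes.
workdir=$(mktemp -d)
printf 'one two\nthree\nfour five six\n' > "$workdir/foo.txt"

# Redirecting the file into wc omits the file name from the output;
# tr -d ' ' strips the padding some implementations add.
line_count=$(wc -l < "$workdir/foo.txt" | tr -d ' ')
word_count=$(wc -w < "$workdir/foo.txt" | tr -d ' ')
byte_count=$(wc -c < "$workdir/foo.txt" | tr -d ' ')

echo "lines=$line_count words=$word_count bytes=$byte_count"
```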
cut
▷ removes parts of each line of a file, keeping only the selected fields
▷ example: $ cut -f2,3-5 foo.txt
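The cut slide can be sketched as follows. By default cut splits fields on tabs; the -d option selects another delimiter. Both sample files and their column layout are assumptions made for this example, not data from the talk.

```shell
workdir=$(mktemp -d)

# Hypothetical tab-separated file: id, name, state.
printf 'id\tname\tstate\n1\tAna\tAC\n2\tBia\tAL\n' > "$workdir/people.tsv"

# Keep only fields 2 and 3 (tab is the default delimiter).
names_and_states=$(cut -f 2,3 "$workdir/people.tsv")

# Hypothetical comma-separated file: -d sets the delimiter.
printf '1,Ana,AC\n2,Bia,AL\n' > "$workdir/people.csv"
states=$(cut -d , -f 3 "$workdir/people.csv")

printf '%s\n' "$names_and_states" "$states"
```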
sed: stream editor
▷ a "full" editor
  ◦ substitution/removal of characters
  ◦ line duplication
  ◦ line removal
  ◦ search
▷ example: $ sed 's/foo/bar/' foo.txt
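Each capability listed on the sed slide maps to a short script. The sketch below uses a made-up three-line file and only POSIX sed features.

```shell
workdir=$(mktemp -d)
printf 'foo one\nbar two\nfoo foo three\n' > "$workdir/foo.txt"

# Substitution: without the g flag only the first match on each line changes.
replaced_first=$(sed 's/foo/bar/' "$workdir/foo.txt")
replaced_all=$(sed 's/foo/bar/g' "$workdir/foo.txt")

# Line removal: delete every line that matches /bar/.
without_bar=$(sed '/bar/d' "$workdir/foo.txt")

# Line duplication: p prints each line once more, on top of the default output.
duplicated=$(sed 'p' "$workdir/foo.txt")

# Search: -n suppresses the default output, so only matching lines are printed.
found=$(sed -n '/two/p' "$workdir/foo.txt")

printf '%s\n' "$replaced_first" "$replaced_all" "$without_bar" "$found"
```

Note that sed writes the result to standard output and leaves foo.txt untouched.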
grep: globally search for a regular expression and print
▷ search
▷ example: $ grep agulha palheiro.txt (looking for a needle, "agulha", in a haystack, "palheiro")
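A small sketch of the grep slide with a few commonly used options. The haystack file is invented for the example.

```shell
workdir=$(mktemp -d)
printf 'hay\nneedle in hay\nNEEDLE again\nmore hay\n' > "$workdir/haystack.txt"

# Plain search: prints every line that contains the pattern (case-sensitive).
matches=$(grep needle "$workdir/haystack.txt")

# -i ignores case, -c counts matching lines instead of printing them.
match_count=$(grep -c -i needle "$workdir/haystack.txt")

# -n prefixes each match with its line number.
numbered=$(grep -n needle "$workdir/haystack.txt")

# -v inverts the match: prints the lines that do NOT contain the pattern.
non_matches=$(grep -v -i needle "$workdir/haystack.txt")

printf '%s\n' "$matches" "$match_count" "$numbered" "$non_matches"
```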
Working well together: the universal interface of text streams
pipes
▷ chaining commands
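To make the chaining idea concrete, here is a minimal sketch in which each command reads the previous command's output. The tab-separated file and its columns are hypothetical.

```shell
workdir=$(mktemp -d)
printf 'alice\tAC\nbob\tAL\ncarol\tAC\n' > "$workdir/people.tsv"

# grep filters the lines, then cut keeps only the first field of what is left.
ac_names=$(grep AC "$workdir/people.tsv" | cut -f 1)

# The same filter feeding wc counts the matching lines.
ac_total=$(grep AC "$workdir/people.tsv" | wc -l | tr -d ' ')

printf '%s\n' "$ac_names" "$ac_total"
```

Because every stage streams its input line by line, none of these tools has to load the whole file before passing data along.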
tr: translate or delete characters
▷ translates characters
▷ deletes
▷ "squeezes"
▷ usage: $ cat <file> | tr <from> <to>
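The three uses of tr named on the slide (translate, delete, squeeze) can be sketched like this. The input strings are made up; tr only reads standard input, which is why it always appears after a pipe.

```shell
# Translate: map each lowercase letter to its uppercase counterpart.
upper=$(printf 'hello world\n' | tr '[:lower:]' '[:upper:]')

# Translate a single character: turn commas into tabs.
tabbed=$(printf 'a,b,c\n' | tr ',' '\t')

# Delete: -d removes every character in the set.
letters_only=$(printf 'a1b2c3\n' | tr -d '[:digit:]')

# Squeeze: -s collapses runs of the same character into one.
squeezed=$(printf 'a    b   c\n' | tr -s ' ')

printf '%s\n' "$upper" "$tabbed" "$letters_only" "$squeezed"
```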
pipes
▷ search with multiple filters
  $ cat random.csv | grep AC | grep "Sr\."
  $ cat random.csv | grep João | grep AL
▷ search + word removal
  $ cat random.csv | grep AC | sed "s/Dr\. //"
  $ cat random.csv | grep GO | sed "s/Sr\. //"
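The slide does not show what random.csv contains, so the sketch below assumes a hypothetical layout (title and name, then a two-letter state code) purely to make the pipelines runnable.

```shell
workdir=$(mktemp -d)

# Hypothetical contents; the real random.csv from the talk is not available.
cat > "$workdir/random.csv" <<'EOF'
Dr. Ana Souza,AC
Sr. Bruno Lima,AC
Sr. Caio Reis,AL
Dr. Duda Melo,GO
EOF

# Multiple filters: lines mentioning AC, narrowed down to those with "Sr.".
# The backslash makes the dot literal instead of "any character".
sr_in_ac=$(cat "$workdir/random.csv" | grep AC | grep 'Sr\.')

# Search + word removal: lines mentioning AC, with the "Dr. " title stripped.
ac_without_dr=$(cat "$workdir/random.csv" | grep AC | sed 's/Dr\. //')

printf '%s\n' "$sr_in_ac" "$ac_without_dr"
```

grep also accepts the file name directly (grep AC random.csv), which saves the extra cat process; the cat form is kept here to match the slide.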
pipes
▷ view the memory usage of running programs
  $ ps aux | sed "s/ \+/\t/g" | cut -f 4,11- | less
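The `\+` and `\t` escapes in that sed expression are GNU extensions. As a hedged alternative, the sketch below swaps in tr -s to turn runs of spaces into single tabs, and runs on two made-up lines that imitate ps aux output so the result is reproducible.

```shell
# Hypothetical lines in the usual `ps aux` column layout.
header='USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND'
row='luiz      1234  0.5  2.1 123456 7890 pts/0    Sl   10:00   0:01 python app.py --debug'

# tr -s ' ' '\t' translates spaces to tabs and squeezes repeated tabs,
# so cut can select field 4 (%MEM) and fields 11 onward (the command line).
mem_and_command=$(printf '%s\n' "$header" "$row" | tr -s ' ' '\t' | cut -f 4,11-)

printf '%s\n' "$mem_and_command"

# Against live data the same idea would read:
#   ps aux | tr -s ' ' '\t' | cut -f 4,11- | less
```

One side effect of either approach is that spaces inside the command's own arguments also become tabs.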
Thank you! @luiz_amf github.com/lamenezes
Credits
Special thanks to all the people who made and released these awesome resources for free:
▷ Presentation template by SlidesCarnival