usable for data science
2. For the future: what efforts we should make to keep Ruby available for data science
3. Request for you: shall we develop our tools and community?
8 / 55
costs of data exchange via a JSON API:
Development and maintenance of API endpoints
JSON serialization and deserialization for exchanging data
Letting data processing systems refer to the same database as the main application
All of this increases the development cost of the main application
14 / 55
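The exchange path the slide criticizes can be sketched with nothing but Ruby's standard library: every request pays for serialization on one side and parsing on the other, on top of maintaining the endpoint itself. The record and field names below are illustrative, not from the talk.

```ruby
require "json"

# The main application serializes a record for the JSON API response...
record  = { "id" => 1, "name" => "sample", "values" => [1.0, 2.0, 3.0] }
payload = JSON.generate(record)

# ...and the data processing system parses it back into its own objects.
# The data is copied twice just to cross the process boundary.
received = JSON.parse(payload)

raise "round-trip mismatch" unless received == record
```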
use a Python interpreter together with a Ruby interpreter in the same process.
PyCall provides low-cost ways of exchanging data:
Direct conversion to Python data types
Sharing the same memory pointers
Using the Apache Arrow data structure via the red-arrow-pycall library
15 / 55
results and collect them in a pandas DataFrame
Visualize the results with seaborn, a Python visualization library built on matplotlib
Perform all of the above in one Ruby script
18 / 55
system has its own internal memory format
Serializing and deserializing data for exchange wastes a lot of CPU time
Similar functions are implemented in multiple systems
42 / 55
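The CPU cost of format conversion can be made concrete with a rough stdlib-only measurement: copying data through JSON costs far more than handing over a reference to the same in-memory structure, which is what an Arrow-style shared format aims for. The data size here is arbitrary.

```ruby
require "json"
require "benchmark"

data = { "values" => (1..100_000).to_a }

# Exchange by serialization: the whole structure is encoded and decoded.
copy_time = Benchmark.realtime do
  JSON.parse(JSON.generate(data))
end

# Exchange by a shared memory format: conceptually just passing a pointer.
share_time = Benchmark.realtime do
  _shared = data
end

raise "copy should cost more" unless copy_time > share_time
```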
became a member of the PMC (Project Management Committee) of Apache Arrow yesterday
This means there is at least one person who develops Ruby support for Apache Arrow as a core developer
So you will be able to use Apache Arrow's new features ASAP
50 / 55
is usable in data science
You can use Python tools from Ruby with PyCall, as demonstrated in this talk
Red Data Tools enables us to use Apache Arrow, which ensures that Ruby will stay connected to multiple data processing systems in the future
But there are lots of things that should be done for the future
52 / 55