
DaeHyun Sung
September 23, 2023

[LiboCon 2023] LibreOffice's current status and community in South Korea

[LiboCon 2023 Day3] LibreOffice's current status and community in South Korea
Link: https://events.documentfoundation.org/libreoffice-conference-2023/talk/CK3YPP/
Currently, I'm a full-time developer and a university student in Korea.
I'll explain 'hwpx', the winner of South Korea's document format war, and the HWP issue.
In South Korea, both OWPML (hwpx) and ODF are document format standards in the public sector.
The Republic of Korea government adopted OpenDocument Format as part of the Korean Industrial Standards, as KS X ISO/IEC 26300, in 2007. It also adopted the OWPML (hwpx) format as part of the Korean Industrial Standards, as KS X 6101, in 2011.
However, South Korea's academic field and its public and private sectors still widely use HWP as a document standard, and documents are still made and distributed in .hwp format. Since 2021, though, the South Korean government has started to choose 'hwpx' instead of 'hwp'.
This year, many South Korean government ministries and state-owned agencies widely use 'hwpx'.
So I'll briefly talk about the 'hwp' and 'hwpx' file format structures.

I'll also talk about several Korean and CJK issues in LibreOffice, the activities of the LibreOffice Korean local community, and some of my Korean and CJK contributions to LibreOffice.
I'll also share my challenges and future plans for both LibreOffice and ODF in Korea.

Transcript

  1. LibreOffice's current status and community in South Korea
    DaeHyun Sung (성대현), sungdh86+git at gmail dot com
    2023-09-23 | Saturday, September 23, 2023 | 2023년 9월 23일 토요일
    LibreOffice Conference 2023, Bucharest, Romania
  2. Who am I?
    ▪ DaeHyun Sung (성대현, 成大鉉, ソン・デヒョン)
    ▪ One of the Korean open source contributors
    ▪ GNOME Foundation member
    ▪ Founding member of the LibreOffice Korean Team (2017-)
    ▪ Member of The Document Foundation (2019-)
  3. Who am I?
    ▪ Free/Libre Open Source enthusiast in Korea
    ▪ Activities: Korean community management & QA, improving some Korean features in LibreOffice
    ▪ Father of a child & husband
    ▪ DevOps engineer at the Korean startup “Lablup” since 2022
  4. Where am I from?
    ▪ Now living in 안양 (安養) Anyang
    ▪ Hometown: 경상북도 (慶尙北道) North Gyeongsang Province
    ▪ Studied and lived in 서울 Seoul
  5. LibreOffice Conference 2023
    ▪ This is my first time visiting Europe
    ▪ My first offline conference in Europe
    ▪ What I knew about Romania & the Romanian language
      ▪ Dracula
      ▪ The meme song, O-Zone’s “Dragostea Din Tei”
  6. LibreOffice status in Korea
    ▪ In South Korea, LibreOffice has few users
    ▪ I report almost all of the Korean bugs (the community is not yet active)
    ▪ Office market share: MS Office 70% : HWP 30%
    ▪ Hancom (한글과컴퓨터)'s HWP has an exclusive position in the public sector’s office market
    ▪ The reason is that under Korean law, the public sector has to purchase only Korean companies' software unless there is a special reason
  7. Document Formats in Korea
    ▪ HWP – Hangul Word Processor (published by Hancom 한글과컴퓨터, brand name: ᄒᆞᆫ글)
    ▪ In South Korea, both OWPML and ODF are document format standards in the public sector
    ▪ The Republic of Korea government adopted OpenDocument as part of the Korean Industrial Standards, as KS X ISO/IEC 26300, in 2007
    ▪ KS X 6101 (OWPML) – an XML-based document specification guided by the ROK government → Hancom made the HWPML and HWPX formats based on government proposals
    ▪ However, Hancom’s binary file format (HWP) is broadly used in Korea
  8. Document Formats in Korea
    ▪ KS X ISO/IEC 26300 – established in 2007, global standard, Korean translation of ODF
      https://www.standard.go.kr/KSCI/standardIntro/getStandardSearchView.do?menuId=919&topMenuId=502&upperMenuId=503&ksNo=KSXISOIEC26300&tmprKsNo=KSXISOIEC26300&reformNo=03&displayBlock=none&displayBlock2=block
    ▪ KS X 6101 (OWPML) – established in 2011, Korean-only standard
      https://www.standard.go.kr/KSCI/standardIntro/getStandardSearchView.do?menuId=503&topMenuId=502&ksNo=KSX6101&tmprKsNo=KSX6101&reformNo=01
  9. HWP 2022
    ▪ Recommends using the HWPX format instead of HWP
    ▪ Hancom's notice recommending the use of hwpx since 2021.04.15
      https://www.hancom.com/board/noticeView.do?board_seq=3&artcl_seq=10924&pageInfo.page=&search_text
  10. HWP Formats
    ▪ HWPML (.hml) ← Hancom’s first XML-based document spec, released in 1997
    ▪ HWP v3.0 (.hwp) ← supported by LibreOffice (old binary format)
    ▪ HWP v5.0 (.hwp) ← currently in popular use in Korea, released in July 2010
    ▪ OWPML (.owpml) ← based on HWPML, a government-guided XML-based document specification, released in 2010 [registered as KS X 6101 in 2011]
    ▪ HWPX (.hwpx) ← XML-based document spec regenerated by Hancom; recommended for use in the HWP program since 2021.04.15
      https://www.hancom.com/board/noticeView.do?board_seq=3&artcl_seq=6453&pageInfo.page&search_text
  11. HWP Formats
    ▪ ClamAV 0.99.1: Hangul Word Processor (HWP) Document Support
      https://blog.clamav.net/2016/03/clamav-0991-hangul-word-processor-hwp.html
  12. HWP vs MS Word vs LibreOffice
    ▪ HWP – binary contents, not an easily readable file format, non-interoperable, proprietary format
    ▪ MS Word (docx), LibreOffice (odt), hwpx – text format, easily readable (XML-based)
      ▪ docx
      ▪ odt
      ▪ hwpx
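
The binary-versus-XML distinction on this slide can be checked from the first bytes of a file. Below is a minimal Python sketch, not taken from the talk. It assumes that docx, odt and hwpx files are ZIP packages and that binary .hwp v5.0 files are OLE2 compound files; the slides do not state the container signatures. The names `sniff_container`, `list_package_entries` and `make_sample_package` are hypothetical helpers introduced for illustration.

```python
import io
import zipfile

# Signature of the first local file header in a non-empty ZIP archive.
ZIP_MAGIC = b"PK\x03\x04"
# OLE2 compound file signature (assumption: binary .hwp v5.0 uses this container).
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"


def sniff_container(data: bytes) -> str:
    """Classify a document by its leading bytes (hypothetical helper)."""
    if data.startswith(ZIP_MAGIC):
        return "zip"
    if data.startswith(OLE2_MAGIC):
        return "ole2"
    return "unknown"


def list_package_entries(data: bytes) -> list[str]:
    """Return the entry names of a ZIP-based document package."""
    with zipfile.ZipFile(io.BytesIO(data)) as package:
        return package.namelist()


def make_sample_package() -> bytes:
    """Build a tiny in-memory ZIP that stands in for an XML-based document."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("content.xml", "<doc>hello</doc>")
    return buffer.getvalue()
```

For a real file, reading the first eight bytes with `open(path, "rb").read(8)` and passing them to `sniff_container` would be enough to tell the two container families apart; the XML parts of a ZIP-based document can then be inspected with any XML parser.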
  13. HWP Issues for Expats in Korea
    ▪ KDE Akademy 2018 keynote: Mapping Crimes Against Humanity in North Korea with FOSS
      https://conf.kde.org/en/akademy2018/public/events/78.html
    ▪ YouTube: https://youtu.be/ITzFXeg4UGU?t=1590
    ▪ The speaker describes the work of his North Korea human rights NGO (전환기 정의 워킹 그룹 / Transitional Justice Working Group), which uses free and open source software and data (FOSSD)
    ▪ He added that using FOSS in Korea is a challenge for foreigners
      ▪ Korean input, fonts, banking & online transactions
      ▪ In addition, Hangul Word Processor / .HWP files
  14. South Korea Gov’s OS Adoption
    ▪ Gooroom OS: https://gooroom.kr/
    ▪ Debian-based Linux distribution, led by 국가보안연구소 (NSR, National Security Research Institute) and 한글과컴퓨터 (Hancom)
    ▪ Gooroom (구름) literally means ‘cloud’ in Korean
    ▪ 공공기관 60만대 노트북 전면 교체 ... 구름 OS 탑재 ‘온북’으로 https://m.etnews.com/20220920000215
    ▪ Translation: The South Korean government will replace 600,000 laptops with laptops named ‘On Book (온북)’, pre-loaded with Gooroom OS
  15. South Korea Gov’s OS Adoption
    ▪ Gooroom OS-based Linux distributions are released by various vendors
    ▪ Hancom: Hancom Gooroom OS
      https://www.hancom.com/product/productGooroomMain.do
    ▪ Tmax: TmaxGooroom OS
      https://www.tmax.co.kr/tmaxgooroom
  16. South Korea Gov’s OS Adoption
    ▪ Critical issue of Hancom’s Gooroom Linux
      ▪ It only supports the HWP Office Viewer
      ▪ It is not FLOSS
      ▪ It is typical proprietary software
  17. Barriers to ODF Use in Korean Law
    ▪ Use hwp, gul, doc, xls, etc.
    ▪ However, there is no indication to use the ODT file format
      https://www.law.go.kr/%ED%96%89%EC%A0%95%EA%B7%9C%EC%B9%99/%EC%A0%84%EC%9E%90%EB%AC%B8%EC%84%9C%EC%A0%9C%EC%B6%9C%ED%8C%8C%EC%9D%BC%EC%9D%98%ED%98%95%EC%8B%9D%EB%B0%8F%EC%9E%AC%EC%A0%84%EC%9E%90%ED%99%94%EC%97%90%EA%B4%80%ED%95%9C%EA%B3%A0%EC%8B%9C/(2022-20,20220901)
  18. Barriers to ODF Use in Korean Law
    ▪ South Korea's Intellectual Property Office’s notice
    ▪ However, there is no indication to use the ODT file format
      https://www.law.go.kr/행정규칙/전자문서제출파일의형식및재전자화에관한고시/(2020-7,20200330)/제6조
  19. Example of MSIT (2021)
    ▪ 과학기술정보통신부 (科學技術情報通信部, Ministry of Science and ICT in Korea) adopted the ODF format (since 2020)
    ▪ [Media news] ZDNet Korea - 어떤 워드프로세서로든 열리는 'ODT'…"국민 문서 활용도↑" (‘ODT’ opens with any word processor, “public document utilization↑”)
      https://zdnet.co.kr/view/?no=20201116094217
  20. Example of MSIT (2021)
    ▪ 과학기술정보통신부 (科學技術情報通信部, Ministry of Science and ICT in Korea) adopted the ODF format (since 2020) – Press Releases (보도자료)
  21. Example of MSIT (2023)
    ▪ 과학기술정보통신부 (科學技術情報通信部, Ministry of Science and ICT in Korea) – Press Releases (보도자료) in 2023
    ▪ Added an hwp/hwpx viewer & viewer link button (made by Synapsoft)
  22. Example of MSIT (2023)
    ▪ However, ODT files can only be downloaded from the “Press Releases” link
    ▪ If you want to edit and submit documents, you must use the proprietary software “HWP”, which can open the hwp and hwpx formats
  23. Migration failure history
    ▪ Gyeonggi-do Province (경기도, 京畿道) in Korea
    ▪ Tried to migrate from HWP to ODF → failed
    ▪ 2020: 경기도, 공문서 'HWP'로 안 쓴다..."웹 표준화 추진" (official documents will not use hwp; use odt and pdf instead of hwp)
      https://zdnet.co.kr/view/?no=20201012101741
    ▪ 2021: 경기도, 탈 HWP 선언하고 한컴 'HWPX' 쓰기로…"방역 DB 관리" (don’t use hwp, use HWPX)
      https://www.ajunews.com/view/20210521080522864
  24. Hancom’s changed policy
    ▪ Since last year (2022), the Hancom site no longer sells downloadable HWP software
    ▪ Registration required, forced payments, vendor lock-in
    ▪ Link: 한컴닷컴 구매 서비스 종료 안내 (2022.09.27), Hancom software purchase service termination notice
      https://www.hancom.com/board/noticeView.do?artcl_seq=11598
  25. Hancom’s EULA
    ▪ Hancom releases an office program & an hwp viewer
    ▪ Hancom Office: proprietary license, must be purchased
    ▪ Hancom hwp viewer: proprietary license
      ▪ Personal use is free; corporate use violates the license
      ▪ If an enterprise wants to use the viewer, it must sign a contract
      ▪ No Linux version of the viewer has been released
  26. South Korea’s HWP alternative
    ▪ The South Korean government knows the problem of proprietary hwp software monopolizing the public sector
    ▪ An alternative for people who don’t buy or use hwp software
    ▪ Hancom released “공공한글”
      ▪ Site: https://www.hancom.com/cs_center/pubhwp.do
      ▪ It can view and edit hwp files
      ▪ However, under the software's EULA, it can only be used for personal purposes
      ▪ It only works on Windows (x86_64 only, not arm64)
  27. South Korea’s HWP alternative
    ▪ The South Korean government knows the problem of proprietary hwp software monopolizing the public sector
    ▪ An alternative for people who don’t buy or use hwp software
    ▪ Synapsoft’s office viewer
      ▪ https://www.synapsoft.co.kr/viewer/
      ▪ The public sector adopted this office viewer solution for web services
      ▪ It is only a viewer; it cannot edit document files
      ▪ Also, it is not FLOSS
  28. South Korea’s HWP alternative
    ▪ HWP alternative office software
    ▪ Synapsoft’s Synap Office https://www.synapsoft.co.kr/office/
      ▪ Web-based office (for enterprise), proprietary software
    ▪ Polaris Office https://www.polarisoffice.com/
      ▪ Web-based office (subscription or for enterprise), Windows, Mac, iOS apps
    ▪ Many South Korean companies can read & write the hwp file format
    ▪ However, they release their products only as proprietary software
    ▪ In my opinion, the open source ecosystem is not yet active in Korea
  29. PyCon KR 2023
    ▪ Day 2 sponsor program
    ▪ Subject: Taming Llama
    ▪ Main panel: Lablup’s CEO Jeongkyu Shin (신정규) and staff, Upstage CEO & HKUST Prof. Sung Hun Kim (김성훈)
    ▪ LLM (Large Language Model) and collecting Korean data
    ▪ One person asked: “How can we analyze data in hwp format files? In Korea, many C-level executives use and write the hwp format”
    ▪ Confirmed the demand for hwp and hwpx file format analysis
  30. PyCon KR 2023
    ▪ Day 2 session: make hwp pypi package https://2023.pycon.kr/session/38
    ▪ To generate HWP documents for office automation, you need to use the win32com API
    ▪ It requires proprietary software: both Windows and Hancom’s HWP
    ▪ However, Hancom plans to discontinue OCX control maintenance after December 31, 2023
    ▪ OCX 컨트롤 지원 종료 고지 (23/12/31~) | OCX Control Support End Notice
      https://forum.developer.hancom.com/t/ocx-23-12-31/765
  31. More Interoperability in Korea
    ▪ Need to generate more interoperable documents for Mac and Linux users & foreigners in Korea
    ▪ Reference: Italo Vignoli https://www.libreoffice.org/assets/Conference/LATAM-Conf/Estandares.pdf
  32. Section Conclusion
    ▪ South Korea's public sector needs to use ODF as a document standard; it wants to replace the HWP format with ODF (possibly for cost efficiency and interoperability)
    ▪ Some local companies can make viewers and editors that support the doc, docx, hwp, hwpx, and ODF formats, but they only sell them as proprietary software and don’t contribute to open source projects
    ▪ This is because the open source ecosystem is hardly active in Korea
  33. Section Conclusion
    ▪ Also, the public sector opposes migrating from hwp to other software & formats
    ▪ As a result, the Korean government requested a machine-readable file format instead of the hwp format (binary-based), and Hancom adopted the hwpx file format (XML+ZIP-based) in the current hwp software (since 2021)
  34. My goal on LibreOffice
    ▪ Build a FLOSS ecosystem & culture in Korea
    ▪ More Korean document translations and bug fixes are needed
    ▪ Need to work on increasing the number of open source developers who can handle ODF
    ▪ My life goal is to implement hwp viewing & editing and task automation functions in LibreOffice
  35. Workings
    ▪ Attended conferences
      ▪ UbuCon Asia 2022 – promoted LibreOffice
      ▪ LibreOffice Kaigi 2023 – keynote speaker
      ▪ UbuCon Korea 2023 – promoted LibreOffice
    ▪ Long-term plan
      ▪ Meetups
      ▪ Fix bugs and improve Korean features in LibreOffice
  36. UbuCon Asia 2022
    ▪ Site: https://2022.ubucon.asia/
    ▪ Date: November 26th ~ 27th, 2022
    ▪ The first offline conference for the LibreOffice Korean Team since the COVID-19 pandemic
    ▪ I joined with Ubucon 2022 support https://2022.ubucon.asia/sponsors/
    ▪ Participated as a session speaker
  37. LibreOffice Kaigi 2023
    ▪ The LibreOffice Japanese Team’s annual meetup
    ▪ I was a keynote speaker
    ▪ Link: https://wiki.documentfoundation.org/JA/Events/LibOKaigi/2023
    ▪ Japanese link: https://libojapan.connpass.com/event/286688/
    ▪ Title: My FLOSS contribution activities in Korean and CJK areas
      韓国語とCJK分野での私のFLOSS貢献活動
      한국어와 CJK 분야에서 저의 FLOSS 기여활동
  38. UbuCon Korea 2023
    ▪ Promoted LibreOffice at UbuCon Korea 2023, held at the Microsoft Korea head office in Seoul, Korea
    ▪ Photo link: https://discourse.ubuntu.com/t/ubucon-korea-2023-was-a-huge-success-with-151-check-ins/38511
  39. Fix default Korean font size
    ▪ Change the default Korean font size from 10.5pt to 10pt
      https://bugs.documentfoundation.org/show_bug.cgi?id=155947
      https://git.libreoffice.org/core/+/4ffa5f2d741368bcc70ec3fd5d5ca1249cfc1e37%5E%21
    ▪ Korean law states that the font size is 10pt, with exceptions (for example, text in “( )” should be 9pt)
  40. Fix default Korean font size
    ▪ Example of Korean law: the enforcement rules of the "Regulations on Administrative Efficiency and Collaboration Promotion" (Korean name: "행정 효율과 협업 촉진에 관한 규정 시행규칙"), since 2011
      https://law.go.kr/%EB%B2%95%EB%A0%B9%EB%B3%84%ED%91%9C%EC%84%9C%EC%8B%9D/(%ED%96%89%EC%A0%95%20%ED%9A%A8%EC%9C%A8%EA%B3%BC%20%ED%98%91%EC%97%85%20%EC%B4%89%EC%A7%84%EC%97%90%20%EA%B4%80%ED%95%9C%20%EA%B7%9C%EC%A0%95%20%EC%8B%9C%ED%96%89%EA%B7%9C%EC%B9%99,20210907,%EB%B3%84%ED%91%9C4)
  41. Fix default Korean font size
    ▪ Why does Japan use 10.5pt as the default font size?
    ▪ pTEX and Japanese Typesetting - Haruhiko Okumura 奥村 晴彦 http://ajt.ktug.org/2008/0201okumura.pdf
    ▪ For Japanese and Latin characters to mingle coordinately, the height plus depth of the Latin font (i.e., 1 em) should be somewhat larger than that of the Japanese font (1 zw). The 10-point js document classes use 10 pt (about 3.5146 mm; 1 pt = 1/72.27 in for TEX and pTEX) Latin font with 13 Q (13 quarter-millimeter = 3.25 mm) Japanese font. The choice is partly derived from the fact that many Japanese books are typeset with 13 Q fonts. The original choice by the pTEX developers was 9.62216 pt (about 3.3818 mm) Japanese for 10 pt Latin. As a comparison, the default font size of Microsoft Word in the Japanese environment is 10.5 pt (1 pt = 1/72 in) for both Japanese and Latin characters.
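
The unit arithmetic in the quoted passage can be reproduced with a few lines of Python. This is a small sketch using only the definitions given in the quote (1 TeX pt = 1/72.27 in, 1 Word pt = 1/72 in, 1 Q = 0.25 mm); the function names are hypothetical and chosen for illustration.

```python
MM_PER_INCH = 25.4


def tex_pt_to_mm(points: float) -> float:
    """Convert TeX points (1 pt = 1/72.27 in) to millimetres."""
    return points * MM_PER_INCH / 72.27


def dtp_pt_to_mm(points: float) -> float:
    """Convert desktop-publishing points (1 pt = 1/72 in) to millimetres."""
    return points * MM_PER_INCH / 72.0


def q_to_mm(q: float) -> float:
    """Convert Q units (quarter-millimetres) to millimetres."""
    return q * 0.25


if __name__ == "__main__":
    print(f"10 pt (TeX)      = {tex_pt_to_mm(10):.4f} mm")
    print(f"9.62216 pt (TeX) = {tex_pt_to_mm(9.62216):.4f} mm")
    print(f"13 Q             = {q_to_mm(13):.4f} mm")
    print(f"10.5 pt (Word)   = {dtp_pt_to_mm(10.5):.4f} mm")
```

Under these definitions, Word's 10.5 pt comes out larger than both the 13 Q and the 10 pt TeX sizes quoted above, which is the gap the slide is pointing at.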
  42. Improve Unicode IVS code
    ▪ I found that the Unicode IVS code only covered the CJK Unified Ideographs block and its Extension blocks A and B
    ▪ Submitted a bug report and source code
    ▪ Bugzilla: isCJKIVSCharacter needs to support CJK Unified Ideographs Extension Block C to H for Unicode15
      https://bugs.documentfoundation.org/show_bug.cgi?id=155820
    ▪ Code: https://gerrit.libreoffice.org/c/core/+/152995
    ▪ A reviewer found that the code block was redundant and refactored it
    ▪ Code: https://gerrit.libreoffice.org/q/topic:isCJKIVSCharacter
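
To illustrate the kind of range check this slide describes, here is a minimal Python sketch. It is not LibreOffice's implementation of isCJKIVSCharacter; it only shows a code point lookup covering the CJK Unified Ideographs block and Extensions A to H, plus the variation selector range used for ideographic variation sequences. The block boundaries are written from the Unicode 15 block list as an assumption and should be checked against the Unicode Character Database; all names are hypothetical.

```python
from typing import Optional

# Assumed block boundaries (Unicode 15); verify against the Unicode block list.
CJK_UNIFIED_BLOCKS = {
    "CJK Unified Ideographs": (0x4E00, 0x9FFF),
    "Extension A": (0x3400, 0x4DBF),
    "Extension B": (0x20000, 0x2A6DF),
    "Extension C": (0x2A700, 0x2B73F),
    "Extension D": (0x2B740, 0x2B81F),
    "Extension E": (0x2B820, 0x2CEAF),
    "Extension F": (0x2CEB0, 0x2EBEF),
    "Extension G": (0x30000, 0x3134F),
    "Extension H": (0x31350, 0x323AF),
}

# Variation Selectors Supplement (VS17 to VS256), used by IVS.
IVS_SELECTORS = (0xE0100, 0xE01EF)


def cjk_block_of(ch: str) -> Optional[str]:
    """Return the name of the CJK unified block containing ch, or None."""
    code_point = ord(ch)
    for name, (first, last) in CJK_UNIFIED_BLOCKS.items():
        if first <= code_point <= last:
            return name
    return None


def is_ivs_selector(ch: str) -> bool:
    """True if ch is a variation selector in the IVS range."""
    return IVS_SELECTORS[0] <= ord(ch) <= IVS_SELECTORS[1]


def has_ivs_sequence(text: str) -> bool:
    """True if some CJK unified ideograph is directly followed by an IVS selector."""
    return any(
        cjk_block_of(base) is not None and is_ivs_selector(selector)
        for base, selector in zip(text, text[1:])
    )
```

A table-driven lookup like this makes adding a future extension block a one-line change, which is one way to avoid the redundancy the reviewer pointed out.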
  43. CJK Issues
    ▪ Characters in CJK (Chinese, Japanese, Korean)
    ▪ Phonetic characters
      ▪ Japanese: hiragana ひらがな, katakana カタカナ
      ▪ Korean: Hangul jamo (consonants & vowels) 한글자모, Hangul syllable 한글음절
    ▪ Ideogram characters
      ▪ Chinese: hànzì 漢字 (Traditional), 汉字 (Simplified)
      ▪ Japanese: kanji 漢字 かんじ
      ▪ Korean: Hanja 한자 漢字
  44. CJK Issues
    ▪ Korean & Japanese adopted and use Chinese characters [ideographs]
    ▪ However, both Korean & Japanese grammar structures differ from Chinese
      ▪ Chinese: S + V + O (subject-verb-object word order)
      ▪ Korean & Japanese: S + O + V (subject-object-verb word order)
    ▪ Korean & Japanese use postposition particles (a word attached to the back of another word to indicate that word’s grammatical role or to add special meaning)
      ▪ Korean: josa 조사 (助詞), Japanese: joshi 助詞
  45. CJK Issues
    ▪ English: I ate a meal – I (Subject) ate (Verb) a meal (Object)
    ▪ Chinese: 我吃了飯 / 我吃了饭 wǒ chīle fàn
      ▪ 我 (I - Subject) 吃了 (ate - Verb) 飯/饭 (meal - Object)
    ▪ Korean: 나는 밥을 먹었어 na-neun bab-eul meog-eoss-eo
      ▪ 나 (I) 는 (postposition particle) 밥 (meal - Object) 을 (postposition particle) 먹었어 (ate - Verb)
    ▪ Japanese: 私はご飯を食べた watashi wa gohan o tabeta
      ▪ 私 (I) は (postposition particle) ご飯 (meal - Object) を (postposition particle) 食べた (ate - Verb)
  46. CJK Issues
    ▪ Subject, Verb, Object, Postposition particle
    ▪ S + V + O
      ▪ English: I love you.
      ▪ German: Ich liebe dich.
      ▪ Chinese: 我愛你. (traditional) 我爱你. (simplified)
    ▪ S + O + V
      ▪ Korean: 나는 너를 사랑한다.
      ▪ Japanese: 私はあなたを愛してる。
      ▪ Mongolian: Би чамд хайртай | bi chamd khairtai
  47. CJK Issues
    ▪ 京 (meaning: Capital)
    ▪ Chinese: sound (音) jīng
    ▪ Korean: meaning (뜻 / 훈 訓) 서울 Seoul; sound (소리 / 음 音) 경 gyeong
    ▪ Japanese: meaning (訓読み) みやこ miyako; sound (音読み) 呉音: キョウ (kyou), 漢音: ケイ (kei), 唐音: キン (kin)
    ▪ Ref. South Korea https://hanja.dict.naver.com/#/entry/ccko/ef212ae49efc4c39af1e8b44bd13fc5d
    ▪ Ref. North Korea https://mirror.adversec.com/dprk/dprk-34c4/DPRK-34C3-PDFs/Education/Secondary/1/chinese_ideograph.pdf
  48. CJK Issues
    ▪ 京 (meaning: Capital)
    ▪ Korean: almost always reads the “sound part”
      ▪ Exceptions: Korean-made Hanja 乭 돌 (meaning: stone); the number 五 (meaning: 5, 다섯 ‘da-seot’; sound: 오 ‘o’)
      ▪ Example: the name of Lee Se-dol 이세돌 (李世乭), the Go player who played against AlphaGo
    ▪ Japanese: sometimes reads the “meaning part”, sometimes the “sound part”
  49. Korean Numbering Texts
    Number | Korean Counting | Korean Legal | Korean Digital | Korean Digital2
    1 | 일 | 하나 | 일 | 一
    2 | 이 | 둘 | 이 | 二
    3 | 삼 | 셋 | 삼 | 三
    4 | 사 | 넷 | 사 | 四
    5 | 오 | 다섯 | 오 | 五
    6 | 육 | 여섯 | 육 | 六
    7 | 칠 | 일곱 | 칠 | 七
    8 | 팔 | 여덟 | 팔 | 八
    9 | 구 | 아홉 | 구 | 九
    10 | 십 | 열 | 일영 | 一零
    11 | 십일 | 열하나 | 일일 | 一一
    20 | 이십 | 스물 | 이영 | 二零
    30 | 삼십 | 서른 | 삼영 | 三零
    99 | 구십구 | 아흔아홉 | 구구 | 九九
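
Two of the numbering styles in this table follow simple rules that can be sketched in Python: the positional Sino-Korean reading (십, 십일, 이십, 구십구) and the digit-by-digit Hangul reading (일영, 일일, 구구). This is an illustrative sketch only, limited to 1 to 99 for the positional form to match the table; it is not LibreOffice's numbering code, and the function names are hypothetical.

```python
# Hangul readings of the digits 0-9 (영 is used for zero, as in 일영 for 10).
DIGITS = ["영", "일", "이", "삼", "사", "오", "육", "칠", "팔", "구"]


def sino_korean(n: int) -> str:
    """Positional Sino-Korean reading for 1..99, e.g. 11 -> 십일."""
    if not 1 <= n <= 99:
        raise ValueError("this sketch only covers 1..99")
    tens, ones = divmod(n, 10)
    parts = []
    if tens:
        # A leading 일 is omitted before 십: 10 is 십, not 일십.
        parts.append(("" if tens == 1 else DIGITS[tens]) + "십")
    if ones:
        parts.append(DIGITS[ones])
    return "".join(parts)


def digit_by_digit(n: int) -> str:
    """Read each decimal digit separately, e.g. 10 -> 일영."""
    if n < 0:
        raise ValueError("negative numbers are not handled in this sketch")
    return "".join(DIGITS[int(d)] for d in str(n))
```

The native counting words (하나, 둘, ... 아흔아홉) do not follow the same positional rule and would need their own lookup tables for ones and tens.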
  50. Google I/O Keynote 2023
    ▪ Why did Google’s Bard choose both Korean and Japanese as its priority service languages?
      https://www.youtube.com/live/cNfINi5CNbY?si=eettGWMm53y-Qn0f&t=1658
    ▪ Both languages have similar grammatical structures: S + O + V
    ▪ Complex & unique character systems
      ▪ Korean: Hangul syllable 한글 음절, Hangul jamo 한글 자모, Hanja 漢字
      ▪ Japanese: Hiragana ひらがな, Katakana カタカナ, Kanji 漢字
    ▪ Google does not have an exclusive position in either country's search engine market
      ▪ Korea: Naver, Japan: Yahoo Japan
  51. Long term Plans – etc
    ▪ Fix LibreOffice Korean bugs
      ▪ Vertical writing
    ▪ Make some features
      ▪ Korean Hanja dictionary, Korean Hanja dictionary for Buddhism, etc.
      ▪ Document templates for Korean
      ▪ ‘hwpx’ file format support
  52. Vertical Writing
    ▪ In East Asian regions (CJK), many scripts can be written horizontally or vertically
    ▪ Example of CJK vertical writing
      1. From top to bottom
      2. Ordered from right to left
  53. Vertical Writing
    ▪ 국립국어원 (國立國語院, National Institute of the Korean Language)’s Korean vertical writing manual
      http://kornorms.korean.go.kr/regltn/popup/regltnNtfcView.do?regltn_code=0001&ntfc_no=6&ntfc_hist_no=1001
    ▪ W3C Requirements for Hangul Text Layout and Typography: 한국어 텍스트 레이아웃 및 타이포그래피를 위한 요구사항
      https://www.w3.org/TR/klreq/
    ▪ W3C Styling vertical Chinese, Japanese, Korean and Mongolian text
      https://www.w3.org/International/articles/vertical-text/
    ▪ Unicode - Vertical Text Layout
      https://www.unicode.org/reports/tr50/
  54. Vertical Writing
    ▪ Bug 132926 - Change Punctuations(comma & period) for Korean vertical writting text layout
      https://bugs.documentfoundation.org/show_bug.cgi?id=132926
  55. Korean Dictionaries
    ▪ HWP 2022’s dictionaries
      ▪ Korean Hanja dictionary
      ▪ Korean-English, English-Korean dictionaries, etc.
  56. Korean Dictionaries
    ▪ Korean Hanja dictionary for Buddhism
    ▪ Example 1) 金剛般若波羅蜜經
      ▪ Korean Hanja sound: 금강반약파라밀경 (geum gang ban yak pa ra mil gyeong)
      ▪ Korean Buddhism sound: 금강반야바라밀경 (geum gang ban ya ba ra mil gyeong)
    ▪ Example 2) 阿耨多羅三藐三菩提
      ▪ Korean Hanja sound: 아누다라삼막삼보제 (a nu da ra sam mak sam bo je)
      ▪ Korean Buddhism sound: 아뇩다라삼먁삼보리 (a nyok da ra sam myak sam bo ri)
  57. Long term Plans
    ▪ HWP’s document templates list
    ▪ Plan: add Korean document templates as a LibreOffice extension or …