Unzipping Chinese Filenames


Unzipping a zip archive that contains Chinese filenames is tricky. For a long time I couldn’t find a perfect solution and had to rely on a workaround instead. Recently, I came across this post, which confirms my findings.

Let’s talk about the issue first. Suppose I compress a file with a Chinese filename into a zip archive on Windows and then want to extract it on Linux. The unzip tool used to handle this perfectly with the -O option, but one day I found that the option was gone. Without it, the extracted file ends up with an unreadable, garbled filename.
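To illustrate how the garbling can arise, here is a minimal Python sketch. It assumes the Windows tool stored the name as GBK bytes without setting the zip format's UTF-8 flag, and that the extractor falls back to CP437, the default the zip specification names for unflagged entries. The helper name `garble` and the sample filename are made up for illustration; the real unzip may convert names differently depending on how it was built.

```python
def garble(name: str, stored: str = "gbk", assumed: str = "cp437") -> str:
    """Encode `name` the way the archiver might, then decode it the wrong way."""
    raw = name.encode(stored)      # bytes written into the archive (assumed GBK)
    return raw.decode(assumed)     # what a CP437-assuming extractor would show


original = "中文.txt"               # hypothetical sample filename
garbled = garble(original)
print(garbled)                     # box-drawing characters instead of Chinese

# The damage is reversible as long as both encodings are known.
recovered = garbled.encode("cp437").decode("gbk")
print(recovered == original)
```

The round trip at the end is the reason an option for specifying the encoding is enough to fix the problem: the bytes in the archive are intact, and only their interpretation is wrong.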

The issue is certainly caused by encoding. The workaround is simple: use any unzipping utility that lets you specify the encoding. For me, the best option is Unarchiver (unar), which I previously found in the AUR. It worked well even without specifying an encoding. The post mentioned at the beginning confirms this. In addition, the comments under that post suggest a patched version of unzip, namely unzip-iconv, which adds back the -O option.
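If neither tool is at hand, the same idea can be sketched with Python's standard zipfile module. This is not how unar or unzip-iconv work internally, only an illustration of "specify the encoding yourself". It assumes the names were stored as GBK, and the function names `fix_name` and `extract_with_encoding` are hypothetical. When an entry lacks the UTF-8 flag, zipfile decodes its name as CP437, so the sketch reverses that step and decodes again with the chosen encoding.

```python
import zipfile
from pathlib import Path

UTF8_FLAG = 0x800  # general-purpose bit 11: the name is stored as UTF-8


def fix_name(info: zipfile.ZipInfo, encoding: str = "gbk") -> str:
    """Return the entry name, re-decoded with `encoding` if it is not flagged UTF-8."""
    if info.flag_bits & UTF8_FLAG:
        return info.filename
    # zipfile decoded the raw bytes as CP437; undo that, then decode properly.
    return info.filename.encode("cp437").decode(encoding)


def extract_with_encoding(archive: str, dest: str, encoding: str = "gbk") -> None:
    """Extract every entry of `archive` into `dest` using the corrected names.

    Note: this sketch does not guard against entries that point outside `dest`,
    so it should only be used on archives you trust.
    """
    dest_dir = Path(dest)
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            target = dest_dir / fix_name(info, encoding)
            if info.is_dir():
                target.mkdir(parents=True, exist_ok=True)
                continue
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(zf.read(info))
```

A call would look like `extract_with_encoding("archive.zip", "out", encoding="gbk")`, where both paths are placeholders. If the guess is wrong, decoding either fails with an error or yields different garbage, so trying another encoding (for example `big5` for archives created on a Traditional Chinese system) is a reasonable next step.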