Splitting large files with the split command


On Windows, you need to install MINGW64 or MINGW32. If the Git client is already installed, you can use Git Bash instead.

Linux ships with the split command.

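Example

-a 3: use three-character suffixes, so with -d the names start at 000

-d: use numeric suffixes

-l 2: put two lines in each output file

--additional-suffix=.txt: append .txt to each output file name

test.txt: the source file to split

test-: output file names start with test-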

split -a 3 -d -l 2 --additional-suffix=.txt test.txt test-
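To see what this command produces, here is a minimal sketch that runs it on a small sample file. It assumes GNU coreutils split (needed for --additional-suffix); the temporary directory and the five-line contents of test.txt are made up for illustration.

```shell
# Work in a throwaway directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a five-line sample file (hypothetical contents).
printf 'line1\nline2\nline3\nline4\nline5\n' > test.txt

# Same command as above: two lines per piece, 3-digit numeric suffix, .txt extension.
split -a 3 -d -l 2 --additional-suffix=.txt test.txt test-

# List the pieces that were created.
ls test-*
```

With five input lines this should list test-000.txt, test-001.txt and test-002.txt, the last one holding the single leftover line.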

Help output

Usage: split [OPTION]... [FILE [PREFIX]]
Output pieces of FILE to PREFIXaa, PREFIXab, ...;
default size is 1000 lines, and default PREFIX is 'x'.

With no FILE, or when FILE is -, read standard input.

Mandatory arguments to long options are mandatory for short options too.
  -a, --suffix-length=N   generate suffixes of length N (default 2)
      --additional-suffix=SUFFIX  append an additional SUFFIX to file names
  -b, --bytes=SIZE        put SIZE bytes per output file
  -C, --line-bytes=SIZE   put at most SIZE bytes of records per output file
  -d                      use numeric suffixes starting at 0, not alphabetic
      --numeric-suffixes[=FROM]  same as -d, but allow setting the start value
  -x                      use hex suffixes starting at 0, not alphabetic
      --hex-suffixes[=FROM]  same as -x, but allow setting the start value
  -e, --elide-empty-files  do not generate empty output files with '-n'
      --filter=COMMAND    write to shell COMMAND; file name is $FILE
  -l, --lines=NUMBER      put NUMBER lines/records per output file
  -n, --number=CHUNKS     generate CHUNKS output files; see explanation below
  -t, --separator=SEP     use SEP instead of newline as the record separator;
                            '\0' (zero) specifies the NUL character
  -u, --unbuffered        immediately copy input to output with '-n r/...'
      --verbose           print a diagnostic just before each
                            output file is opened
      --help     display this help and exit
      --version  output version information and exit

The SIZE argument is an integer and optional unit (example: 10K is 10*1024).
Units are K,M,G,T,P,E,Z,Y (powers of 1024) or KB,MB,... (powers of 1000).
Binary prefixes can be used, too: KiB=K, MiB=M, and so on.

CHUNKS may be:
  N       split into N files based on size of input
  K/N     output Kth of N to stdout
  l/N     split into N files without splitting lines/records
  l/K/N   output Kth of N to stdout without splitting lines/records
  r/N     like 'l' but use round robin distribution
  r/K/N   likewise but only output Kth of N to stdout

GNU coreutils online help: <https://www.gnu.org/software/coreutils/>
Report any translation bugs to <https://translationproject.org/team/>
Full documentation <https://www.gnu.org/software/coreutils/split>
or available locally via: info '(coreutils) split invocation'

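Option reference

-a, --suffix-length=N

-a is the short form of --suffix-length=N.

Sets the number of characters in the generated suffix; the default is 2. For example, with -d and the prefix test_: test_01.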

--additional-suffix=SUFFIX

Appends an additional suffix (such as a file extension) to the name of every output file.

-b, --bytes=SIZE

Splits the file by size. Units: K, M, G, T, P, E, Z, Y (powers of 1024). Without a unit, SIZE is in bytes.
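A short sketch of splitting by size and putting the pieces back together. The file name big.bin, the prefix part- and the 2500-byte size are assumptions chosen only for the demonstration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a 2500-byte sample file (hypothetical name big.bin).
head -c 2500 /dev/zero > big.bin

# Cut it into pieces of at most 1K (1024 bytes): part-aa, part-ab, part-ac.
split -b 1K big.bin part-

# Concatenating the pieces in name order restores the original.
cat part-* > restored.bin
cmp big.bin restored.bin && echo "identical"
```

Because the shell expands part-* in sorted order, a plain cat is enough to reassemble the file.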

-d

Names the output files with numeric suffixes starting at 0, instead of the default alphabetic names xaa, xab, ...

--numeric-suffixes[=FROM]

Like -d, but also lets you set the starting value; the default start is 0.
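As an illustration of the start value, the sketch below numbers the pieces from 1 instead of 0. It assumes GNU split; in.txt and the prefix chunk- are hypothetical names.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Three-line sample input (hypothetical contents).
printf 'a\nb\nc\n' > in.txt

# One line per piece, numeric suffixes starting at 1 rather than 0.
# The value must be attached with "=" because the argument is optional.
split -l 1 --numeric-suffixes=1 in.txt chunk-

ls chunk-*
```

This should list chunk-01, chunk-02 and chunk-03; the suffix stays two characters wide because -a was not given.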

-x, --hex-suffixes[=FROM]

Works like -d and --numeric-suffixes[=FROM], except that the output files are named with hexadecimal suffixes.
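A small sketch of hexadecimal naming, assuming a GNU split recent enough to support -x. Eleven pieces are used so that the suffix goes past 09; in.txt and the prefix hex- are hypothetical names.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Eleven lines of input, one number per line.
seq 1 11 > in.txt

# One line per piece, hexadecimal suffixes: hex-00 ... hex-09, hex-0a.
split -l 1 -x in.txt hex-

ls hex-*
```

The eleventh piece is named hex-0a, where -d would have produced hex-10.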

-e, --elide-empty-files

Used together with -n: do not create empty output files.

--filter=COMMAND

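Sends each piece to the standard input of the shell COMMAND, much like a pipe, instead of writing it straight to a file. The name the piece would have had is available to COMMAND as $FILE.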
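One common use of --filter is compressing each piece as it is written. The sketch below assumes GNU split and that gzip is installed; numbers.txt and the prefix part- are hypothetical names.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Six lines of sample input.
seq 1 6 > numbers.txt

# Single quotes keep $FILE unexpanded here, so the shell started by split sees it.
split -l 2 --filter='gzip > "$FILE.gz"' numbers.txt part-

ls part-*

# Show the contents of the first compressed piece.
gunzip -c part-aa.gz
```

This should leave part-aa.gz, part-ab.gz and part-ac.gz, with no uncompressed pieces on disk.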

-l, --lines=NUMBER

Splits the file by line count, putting NUMBER lines in each output file.

-n, --number=CHUNKS

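  • N split into N files based on size of input
    • Splits the input into N pieces of roughly equal size.
  • K/N output Kth of N to stdout
    • Divides the input into N pieces by size and prints only the Kth piece.
  • l/N split into N files without splitting lines
    • Divides the input into N pieces by size without cutting any line; a line that straddles a boundary is kept whole in the earlier piece.
  • l/K/N output Kth of N to stdout without splitting lines
    • Same division as l/N, but prints only the Kth piece.
  • r/N like 'l' but use round robin distribution
    • Like l/N in that lines are never cut, but lines are dealt out to the N files in turn (round robin) rather than in contiguous blocks.
  • r/K/N likewise but only output Kth of N to stdout
    • Same round-robin distribution as r/N, but prints only the Kth of the N pieces; lines are kept whole.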
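The CHUNKS forms above can be tried on a ten-line file. This is a sketch assuming GNU split; numbers.txt and the prefix chunk- are hypothetical, and the exact line counts per l/3 piece are not shown because they depend on byte sizes.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Ten lines: 1 through 10.
seq 1 10 > numbers.txt

# l/3: three files of roughly equal size, no line cut in half.
split -n l/3 numbers.txt chunk-
ls chunk-*

# The pieces concatenate back to the original.
cat chunk-* | cmp - numbers.txt && echo "identical"

# r/2/3: deal lines out round robin to 3 pieces and print only the 2nd one.
split -n r/2/3 numbers.txt
```

The last command writes no files; it should print lines 2, 5 and 8, since every third line starting from the second goes to piece 2.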

-t, --separator=SEP

Uses the given character instead of newline as the record separator; '\0' selects the NUL character.
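A brief sketch of a custom separator, assuming a GNU split that supports -t. Here a comma delimits the records; csv.txt and the prefix rec- are hypothetical names.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Five comma-separated records on a single line, with no trailing newline.
printf 'a,b,c,d,e' > csv.txt

# Treat "," as the record separator and put two records in each piece.
split -t , -l 2 csv.txt rec-

# Print each piece on its own line for inspection.
for f in rec-*; do printf '%s: %s\n' "$f" "$(cat "$f")"; done
```

Each record keeps its trailing separator, so the pieces should contain "a,b,", "c,d," and "e".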

-u, --unbuffered

Copies input to the output files immediately, without buffering; used together with -n r/...

--verbose

Prints a diagnostic message just before each output file is opened.

--help

Displays the help text for split and exits.

--version

Prints version information and exits.

Published 2019-07-19
