我有一个项目,我å¯ä»¥ä½œå…¶å¯¼å¸ˆã€‚ 按开æºå¤ä»¤è¥ï¼Œæ¯ä¸ªé¡¹ç›®éœ€è¦ä¸€ä¸ªå®¿ä¸»å•ä½ï¼Œè¿™ä¸ªé¡¹ç›®éœ€è¦ä¸€ä¸ªå®¿ä¸»å•ä½ï¼Œæˆ‘挂è°å下呢? 这个项目其实是佟辉è€å¸ˆè·Ÿæˆ‘讨论如何培è®å¦ç”Ÿåšä¸€ä¸ª 1. 确实没人åšè¿‡çš„。 2. 确实有需求的。 3. 适åˆå¼€æº/è‡ªç”±è½¯ä»¶é£Žæ ¼çš„ã€‚ 4. 使用基础Cè¯è¨€ç¼–程就å¯ä»¥åšåˆ°çš„。 5. 很å°çš„项目。 作用是æä¾›å¦ç”Ÿä»¬ä¸€ä¸ªå¼€æºå®žè·µçš„机会,获得长进,åŒæ—¶å¯¹å¼€æº/自由软件社区确实有所贡献。 è®¾è®¡è¿™æ ·ä¸€ä¸ªé¡¹ç›®å¾ˆå›°éš¾ï¼Œå› ä¸ºæ‰€æœ‰ä½¿ç”¨åŸºç¡€Cè¯è¨€ç¼–程就å¯ä»¥åšåˆ°çš„,很å°è€Œç¡®å®žæœ‰ 需求的项目都有人åšè¿‡äº†ã€‚ 我们共åŒè®¾è®¡è¿‡ä¸€æ®µæ—¶é—´ï¼Œè®¾è®¡å‡ºä¸€ä¸ªé¢˜ç›®ï¼ŒåŽŸæ‰“算让我åšå¯¼å¸ˆå¸¦ä½Ÿè€å¸ˆçš„å¦ç”Ÿåšï¼ˆé‚£ 时候他是一线åšæ•™å¦çš„),但是这个看起æ¥éš¾åº¦ä¸å¤§çš„项目,竟然没有å¦ç”Ÿä¼šåšï¼ŒçŽ°åœ¨ CSDN的读者圈更大,也许å¯ä»¥å‡ºæ¥åšã€‚ è¿™æ ·æˆ‘çŽ°åœ¨ä»ç„¶æ„¿æ„当导师,就跟当时跟佟è€å¸ˆä¸€èµ·å·¥ä½œçš„ä¸€æ ·ã€‚ 项目需求è§æœ¬ä¿¡é™„件。Title: foldcolumn
Linux命令columnæÂÂ供了-tå‚数,这个å‚数å¯以使表格数æ®(典型的是TSV数æ®)格å¼Â化以便用在å±Â幕显示上ã€Â电åÂÂ邮件æ£文ä¸ÂåŠ打å°ä¸Â。
Linux命令foldæÂÂ供了-wå‚数,å¯以使文本按指定列宽æ¢行。
column -t
åª管排布表格ä¸Â管æ¢行,实现的效果比较基础,实际使用ä¸Â,表格内的å•元格也需è¦Â能æ¢行。这就需è¦Âfold和column命令的功能结åˆ起æ¥。这就是本项目的需求。实际使用ä¸Â还有一些其它的需求。
è¦Â求实现一个程åºÂfoldcolumn,具有下述功能:
传统column -t输出的行宽总是å–决于最宽的数æ®列,永ä¸Âåš“过宽æ¢行â€Â。foldcolumn需è¦Â一个--width=å‚数(简写为-w),指定输出媒体的宽度。比如:foldcolumn -w 80指定输入æ¯Â行80个å—符ä½Â这样宽(ä¸Â是æ¯Â行80个å—符)。指定了宽度,对于太宽的数æ®列就必须æ¢行。举例而言,对于输入:
$ printf "column 1\tcolumn 2\n \è¦Â实现这个需求,有两个难点。一是列宽的算法。高增ç¦ <pgf00a gmail com>æÂÂ出了一个简å•的算法以供å‚考:
The quick brown fox jumps over a lazy dog. \
The quick brown fox jumps over a lazy dog. \
The quick brown fox jumps over a lazy dog. \
The quick brown fox jumps over a lazy dog. \t \
That's it. Thanks reading.\n" | /usr/loca/bin/foldcolumn -w 80
column 1 column 2
The quick brown fox jumps over a lazy dog. The That's it.
quick brown fox jumps over a lazy dog. The quick Thanks
brown fox jumps over a lazy dog. The quick brown reading.
fox jumps over a lazy dog.
å‡设:
- å—符ç‰宽
- å•è¯Âå¯以在任æ„Âä½Âç½®æ–Â开(补充连å—符)
- ä¸Â必åÂÂ分精确
- æ¯Â行至少å¯以放下所有列
那么:
对于å‰Âé¢给定的例åÂÂ,差ä¸Â多就是6比1。 行数最少的情况是:
- 把所有文å—连ç»Â写,ä¸Â分列的情况总行数最少
- è¦Âåš的就是把所有文å—连ç»Â写,ä¸Â过改å˜一下文å—的顺åºÂ,产生分列的效果
- æ¯Â列的宽度为:行宽*(列å—符数/总å—符数)
- 算法就是:对于æ¯Â行,先写第一列,到达第一列列宽åŽ,写第二列 ...
The quick brown fox jumps over a lazy dog.The quick brown
fox jumps over a lazy dog.The quick brown fox jumps over
a lazy dog.The quick brown fox jumps over a lazy dog. That's
it. Thanks reading.å³所有内容连ç»Â的输出。
è¦Âåš的就是把所有文å—连ç»Â写,ä¸Â过改å˜一下文å—的顺åºÂ,产生分列的效果。
上例就是将圈ä½Â的å—,白底黑底的互相交æ¢一下。
一行一行的输出
对于æ¯Â行:
先写第一列,到达第一列列宽åŽ,
写第二列,到达第二列的列宽åŽ,
写第三列,
......
列宽的比例,就是å—符数的比例(因为å‡设了å—符ç‰宽,å¦则需è¦Â计算实际
宽度的比例)这个方法å¯以扩充的,比如å•è¯Âä¸Âå¯分的时候,å¯以动æ€Â调整, æ¯Â次动æ€Â调整幅度é€Âæ¸Âå‡Âå°Â,直到找到较为åˆ适的情况。 或者计算列宽时留出空挡,因为大家往往关注的是左边界的对é½Â。
张韡æ¦ <zhangweiwu realss com> 使用awkè¯Â言实现了这个算法,以供å‚考:
先制åš一个测试文件:
$ printf "The sun went down with practiced bravado. Twilight crawled across the \
sky, laden with foreboding. \tAfter Y2K, the end of the world had \
become a cliché. But who was I to talk, a brooding underdog avenger \
alone against an empire of evil out to right a grave injustice. \
Everything was subjective. There were only personal apocalypses. Nothing \
is a cliché when it is happening to you.\n\
It was a lucky break. The goons inside were spooked, but luck always came \
with a price tag.\tIt was not about how good you were. It was chaos \
and luck and anyone who thought differently was a fool.\n\
Snow fell like ash from post-apocalyptic skies.\t\
Who was I kidding, the best I was, was superman on kryptonite.\n" > /tmp/input.txtå†Â把下é¢的脚本å˜为 bin/foldcol.awk
#!/usr/bin/awk -f
# foldcol.awk: perform like fold(1) plus column(1)
# example: foldcol.awk width=72 pass=1 /tmp/input.txt pass=2 /tmp/input.txt
BEGIN { FS="\t"; width=80; } # table default to 80 columns wide
pass==1 {
for (i=1; i<=NF; i++) {
c[i] += length($i); # count of number of chars in each column
t += length($i); # total chars in all columns
}
}
pass==2 {
for (i=1; i<=NF; i++) {
cmd[i] = "echo '" $i "' | " "fold -sw" int(c[i]/t*width)
}
tr = 1 # number of columns that has output of fold(1)
while (tr > 0) { # go on until no column has output from fold(1)
tr = 0
for (i=1; i<=NF; i++) {
l = "";
tr += (cmd[i] | getline l)
printf "[%-" int(c[i]/t*width) "s]", l
}
print "";
}
for (i=1; i<=NF; i++) {
close(cmd[i])
}
}准备好脚本和测试数æ®åŽ,执行一下脚本:
$ foldcol.awk width=72 pass=1 /tmp/input.txt pass=2 /tmp/input.txt
[The sun went down with ][After Y2K, the end of the world had become a ]
[practiced bravado. ][cliché. But who was I to talk, a brooding ]
[Twilight crawled across ][underdog avenger alone against an empire of ]
[the sky, laden with ][evil out to right a grave injustice. ]
[foreboding. ][Everything was subjective. There were only ]
[ ][personal apocalypses. Nothing is a cliché ]
[ ][when it is happening to you. ]
[ ][ ]
[It was a lucky break. ][It was not about how good you were. It was ]
[The goons inside were ][chaos and luck and anyone who thought ]
[spooked, but luck ][differently was a fool. ]
[always came with a ][ ]
[price tag. ][ ]
[ ][ ]
[Snow fell like ash from ][Who was I kidding, the best I was, was ]
[post-apocalyptic skies. ][superman on kryptonite. ]
[ ][ ]
给定å‚数-b(block-drawing),å¯以使用制表符画表。 制表符是如下图所示的文å—。
┌─┬â”Â
│ ││
├─┼┤
└─┴┘
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
6 | ┘ | ┠| ┌ | └ | ┼ | |||||||||||
7 | ─ | ├ | ┤ | ┴ | ┬ | │ |
应该许å¯输入数æ®å•元格里带有硬回车。
$ printf "Math\tResult\n 1\r+ 2\r---\r 3\tCorrect\n 1\r- 2\r---\r 1\tIncorrect\n" | foldcolumn -b
┌────┬────────────â”Â
│Math│ Result │
├────┼────────────┤
│ 1 │ Correct │
│+ 2 │ │
│--- │ │
│ 3 │ │
├────┼────────────┤
│ 1 │ Incorrect │
│- 2 │ │
│--- │ │
│ 1 │ │
└────┴────────────┘
å¯以å‘用户æÂÂ供-f <precision>
å‚数,以便自动格å¼Â化数å—。例如:
$ printf '123.2321825\n125888.4888\n'
123.2321825
125888.4
$ printf '123.2328825\n125888.4\n' | foldcolumn -f 3
123.232
125888.489
注æ„Â-då‚数需è¦Â程åºÂ自己判æ–Âå•元格里是数å—还是å—符,并且数å—需è¦Âå‘å³对é½Â。
æÂÂ供-c <char>
å‚数,åÂŽé¢å¯以带一个å—符串。以æ¤å—符串开头的行,ä¸Âåš处ç†。这有些类似注释。
输入:
$ printf "Item\tPrice\nComputer\t4000.00\nDesk\t300.00\n* This price table does not include tax.\n" | ./foldcolumn -c '*' -f 2 Item Price Computer 4000.00 Desk 300.00 * This price table does not include tax.
和其它Linux命令行工具一样,对于æžÂ长的文本应该能从善如æµÂ。尤其是新程åºÂ员应该注æ„Âä¸Âè¦Â把所有东西都放在 内å˜里,第一éÂÂ应该åƒÂstrfile(1)这样先过一 éÂÂ,第二éÂÂå†Â输出。