[pan2: 14/23] Guess deliberate line breaks
- From: Petr Kovář <pmkovar src gnome org>
- To: commits-list gnome org
- Cc:
- Subject: [pan2: 14/23] Guess deliberate line breaks
- Date: Sun, 29 May 2011 13:04:43 +0000 (UTC)
commit 3741e64e23909d60a21cc072105c832d2cb3b522
Author: K. Haley <haleykd users sf net>
Date: Sat May 15 00:49:38 2010 -0600
Guess deliberate line breaks
Guess the wrap length of the original message by finding the max line length.
If the first word on a line could have been put on the previous line,
assume it was a deliberate line break.
pan/usenet-utils/text-massager-test.cc | 16 +++++++++-------
pan/usenet-utils/text-massager.cc | 18 +++++++++++++++---
2 files changed, 24 insertions(+), 10 deletions(-)
---
diff --git a/pan/usenet-utils/text-massager-test.cc b/pan/usenet-utils/text-massager-test.cc
index 3013e95..6043383 100644
--- a/pan/usenet-utils/text-massager-test.cc
+++ b/pan/usenet-utils/text-massager-test.cc
@@ -71,11 +71,12 @@ int main (void)
"Cybe R. Wizard wrote:\n"
"\n"
"> Nice to know it works, right, and that's why I\n"
-"> tried it. I ran SETI home under win95 for a\n"
-"> while but on my Pentium 166 it's not really\n"
-"> worth it. It took upwards of 500 hours to do\n"
-"> one WU running full time in the background. Will\n"
-"> the Linux version do better???\n"
+"> tried it.\n"
+"> I ran SETI home under win95 for a while but on\n"
+"> my Pentium 166 it's not really worth it. It\n"
+"> took upwards of 500 hours to do one WU running\n"
+"> full time in the background.\n"
+"> Will the Linux version do better???\n"
"\n"
"500 hours seems like an awfully long time to me...\n"
"I'm running setiathome on all my systems, and on\n"
@@ -86,8 +87,8 @@ int main (void)
"> that came with my Mandrake 7.2 the Galaxies 2.0\n"
"> screensaver ran VERY slowly. I had no real hope\n"
"> that Codeweaver's wine would do any better but\n"
-"> the thing runs FASTER than under win95. I wonder\n"
-"> why that is...\n"
+"> the thing runs FASTER than under win95.\n"
+"> I wonder why that is...\n"
"\n"
"Heh, I remember OS/2 running Windows programs\n"
"faster than windows did :^)\n"
@@ -99,6 +100,7 @@ int main (void)
"\n"
"Jan Eric";
out = tm.fill (in);
+ std::cout<<out<<std::endl;
check (out == expected_out);
/* wrap real-world 2 */
diff --git a/pan/usenet-utils/text-massager.cc b/pan/usenet-utils/text-massager.cc
index d34d808..10c7cf7 100644
--- a/pan/usenet-utils/text-massager.cc
+++ b/pan/usenet-utils/text-massager.cc
@@ -111,11 +111,17 @@ namespace
void merge_fixed (paragraphs_t &paragraphs, lines_t &lines, int wrap_col)
{
int prev_content_len = 0;
+ int max_len = wrap_col;
StringView cur_leader;
std::string cur_content;
for (lines_cit it=lines.begin(), end=lines.end(); it!=end; ++it)
{
+ const Line& line (*it);
+ max_len = MAX(max_len, line.leader.len + line.content.len);
+ }
+ for (lines_cit it=lines.begin(), end=lines.end(); it!=end; ++it)
+ {
const Line& line (*it);
bool paragraph_end = true;
bool hard_break = false;
@@ -128,9 +134,15 @@ namespace
paragraph_end = true;
}
- // we usually don't want to wrap really short lines
- if (prev_content_len && prev_content_len<(wrap_col/2))
- paragraph_end = true;
+ // if the first word could have been wrapped onto the previous
+ // line but wasn't, assume a deliberate line break.
+ if (!paragraph_end && prev_content_len && line.content.len)
+ {
+ int space = max_len - (prev_content_len + line.leader.len) - 1;
+ if ( space > 0 && ((line.content.len < space)
+ || g_utf8_strchr (line.content.str, space, ' ')) )
+ paragraph_end = true;
+ }
if (paragraph_end) // the new line is a new paragraph, so save old
{
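
For readers who want to try the heuristic outside of Pan, the two steps in the hunk above (guess the poster's wrap width, then test whether the first word would have fit on the previous line) can be sketched roughly as below. This is only an illustration under stated assumptions: `QuotedLine`, `guess_wrap_length` and `looks_deliberate` are hypothetical names, `std::string` stands in for Pan's `StringView`, and a plain byte search replaces the `g_utf8_strchr` call, so it is not the code from `text-massager.cc`.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for Pan's Line type: a quote leader such as "> "
// followed by the line's content.
struct QuotedLine {
  std::string leader;
  std::string content;
};

// Step 1: guess the wrap width used by the original poster. The result is
// the longest leader+content length seen, but never below wrap_col.
int guess_wrap_length(const std::vector<QuotedLine>& lines, int wrap_col) {
  int max_len = wrap_col;
  for (const QuotedLine& line : lines) {
    const int len = static_cast<int>(line.leader.size() + line.content.size());
    max_len = std::max(max_len, len);
  }
  return max_len;
}

// Step 2: decide whether the break before `cur` looks deliberate.
// prev_content_len is the content length of the preceding line.
bool looks_deliberate(int prev_content_len, const QuotedLine& cur, int max_len) {
  if (prev_content_len == 0 || cur.content.empty())
    return false;

  // Room left on the previous line, minus one column for the joining space.
  const int space =
      max_len - (prev_content_len + static_cast<int>(cur.leader.size())) - 1;
  if (space <= 0)
    return false;

  // Length of the first word: up to the first ' ' or the whole content.
  const std::string::size_type pos = cur.content.find(' ');
  const int first_word = static_cast<int>(
      pos == std::string::npos ? cur.content.size() : pos);

  // If the first word would have fit, the poster chose to break here.
  return first_word < space;
}
```

With a guessed width of 21, a line following 19 characters of content is treated as an ordinary wrap (no room was left), while a line following only 3 characters of content is treated as a deliberate break.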