copy_if destination using an ostream_iterator remains waiting for input...

Hi,

I am seeing behavior I cannot explain in the following code:

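#include <algorithm>   // copy_if
#include <functional>  // logical_and
#include <iostream>
#include <iterator>    // istream_iterator, ostream_iterator
#include <string>

static bool begins_with_a(const std::string& s)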
{
	return s.find("a") == 0;
}

static bool ends_with_b(const std::string& s)
{
	return s.rfind("b") == s.length() - 1;
}

template<typename A, typename B, typename BinaryOp>
auto combine(BinaryOp op, A a, B b)
{
	return [=](auto param)
	{
		return op(a(param), b(param));
	};
}

void useCombine()
{
	using namespace std;

	auto a_xxx_b = combine(logical_and<>{}, begins_with_a, ends_with_b);
		
	copy_if(istream_iterator<string>{cin}, istream_iterator<string>{},
				ostream_iterator<string>{cout, ", "},
				a_xxx_b);

	cout << endl;
}


The problem is that when copy_if runs and I type the input (say ab attj arb), the correct words are displayed (in this example, ab arb), but then the console keeps waiting for input until I type Ctrl+Z.

I tried putting the end-of-input character (i.e. Ctrl+Z) on the same line as the words (e.g. ab attj arb Ctrl+Z), expecting that the end of the istream would be reached and execution would continue at the next line (i.e. cout << endl), but it does not.
I still need to press Ctrl+Z again, on a line by itself, for control to move on to cout << endl.

What is going on here?

Regards,
Juan Dent
I have simplified it so as to isolate the part I do not understand:

vector<string> va;
copy(istream_iterator<string>{cin}, istream_iterator<string>{}, back_inserter(va));


This is my question:

Why do I need to type Ctrl+Z by itself for reading from cin to finish? I mean, why doesn't input of some letters followed by Ctrl+Z indicate to cin that EOF has been reached?

On the other hand, if the input is read as integers, then Ctrl+Z immediately signals EOF regardless of whether it occurs by itself or not.

In other words, when reading ints, this line is read and input finishes:

120 657 Ctrl+Z

Whereas when reading strings, this line yields the string tokens but does not finish input:

aab aas jsjjjs Ctrl+Z


All I can think of is that Ctrl+Z is taken to be whitespace when reading words. But then why does typing only Ctrl+Z finish input? It should be disregarded as whitespace!


Regards,
Juan Dent






You are talking about Ctrl-Z like it's a character. Ctrl-Z (or Ctrl-D on *nix) is not exactly a character because it has been hooked up to trigger the eof signal when pressed. For instance, it does not occur at the end of a file to indicate EOF. It is just a special key combination that can send an eof signal from the terminal.

On Windows they seem to have decided that you can only send that signal at the beginning of a line.

Your example of a difference when reading ints or strings may indicate that when pressed anywhere but the beginning of a line it is treated like a character. Since it's not a digit it ends int input, but not string input.

Also, it's always best to post runnable code so we can easily play with it ourselves.
Solved:

The EOF marker (Ctrl+Z) must be the first character in the input buffer to trigger EOF behaviour in the stream.

If it is preceded by other characters, it will not be taken as EOF.

The exception is reading numbers: they can come before Ctrl+Z and the stream will still take it to mean that EOF was reached.


Still, this behaviour surprises me ... Ctrl+Z should mean EOF regardless of whether it has other characters before it!
Interestingly, on linux I need to hit Ctrl-D twice to send eof at the end of a line but only once if at the beginning of a line.

Do you have a reference for your statements in the "Solved" post?

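Also, your begins_with_a and ends_with_b functions can be improved. There is no need to use find or rfind, which may scan the whole string when the answer is no. Also, your ends_with_b returns true for an empty string: s.length() - 1 wraps around to npos, which is exactly what rfind returns when it finds nothing.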
static bool begins_with_a(const std::string& s) {
	return s.length() != 0 && s[0] == 'a';
}
static bool ends_with_b(const std::string& s) {
	return s.length() != 0 && s[s.length() - 1] == 'b';
}

tpb wrote:
on linux I need to hit Ctrl-D twice to send eof at the end of a line but only once if at the beginning of a line.

On Linux, when you're reading from console, your application is sitting in the POSIX read() system call, waiting for it to return.

When file descriptor zero (standard input) is a terminal rather than a redirected file, read() returns when either Enter or Ctrl-D is pressed.
If you type "abc" and press Enter, read() returns 4 and writes "abc\n" to the buffer.
If you type "abc" and press Ctrl+D, read() returns 3 and writes "abc" to the buffer.

The way end of input is communicated from the POSIX API layer to the C I/O stream layer is that the read() call returns zero. To get that, you have to press Ctrl-D at the start of a line (after Enter), or twice (after another Ctrl-D), or as the very first input (I guess that also counts as 'beginning of a line').
Cubbi, do you know how this EOF works on Windows at the system level?
Why does cin behave so differently depending on the type of data to read, regarding the detection of EOF (Ctrl+Z on Windows)?
In other words, why does istream_iterator<string>{cin} disregard Ctrl+Z except if it is the first key in a line, while istream_iterator<int>{cin} will recognize Ctrl+Z anywhere and end the input?
When you say it recognizes Ctrl-Z anywhere and ends the input, I assume you mean that you can write some numbers and then, without hitting return, you can press Ctrl-Z and then press return and that ends input. Whereas if it was reading strings and you did the same thing, you would have to press Ctrl-Z and return again on the next blank line to end the input. Right?

I gave my hypothesis for that above. You could test it (I can't since I don't have windows). Try this:
#include <iostream>
#include <string>
using namespace std;

int main() {
    string s;
    cin >> s;  // type "hello" then hit Ctrl-Z, then return
    cout << s.length() << '\n';  // 5 for "hello" alone; 6 if Ctrl-Z was read as a character
    if (s.length() > 5 && s[5] == char(26))
        cout << "Ctrl-Z found\n";
}

Thanks tpb -- very clear!!
JUAN DENT wrote:
Cubbi do you know how this eof works in Windows? at the system level?

No, but I just looked it up.

In the Windows Console, Ctrl-Z has no magic powers (unlike Ctrl-D on Linux): it actually creates a regular ASCII character code \x1A and puts it on the input line. You can keep typing after that, and press more Ctrl-Z, and keep building up the line of input. You need to press the Enter key to send the line to the program.

When the Windows C library processes the line buffer it got from the console (via WinAPI call ReadFile), it treats \x1A as a physical "end of file" character (for backwards compatibility with the long-extinct operating system CP/M).
* If \x1A is first in the line, the C layer returns EOF and you're done.
* If \x1A is not the first character on the line of input, the C layer processes the characters in the buffer up to and including the first \x1A, and throws away the rest of the line (including the \n, so if you read with something like getline(), you will need to press Enter again to see the line).

This is also the reason you can't read binary files on Windows in text mode: the first time the read hits a \x1A, it throws away the rest of the file. (On Linux and all POSIX systems, there is no distinction between binary and text mode.)