grep expression

I am having trouble finding an expression that will match a word that starts with "b", ends with "t", and contains "o". The following expression matches something that begins with "b" and ends with "t" but doesn't account for the "o": ^[Bb]..t$ I understand that putting bracket expressions next to each other, [][], matches consecutive letters in a word. I just can't figure out how to look for, say, a letter five characters further along. Any help would be nice.
I figured it out. I piped it.

grep "^b" file.txt | grep "t$" | grep "o"
A regex will work too:
grep "^[Bb].*o.*t$" file.txt

Read the first few paragraphs of the man page: http://man7.org/linux/man-pages/man7/regex.7.html
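A minimal sketch comparing the two approaches above, assuming GNU grep and a made-up word list with one word per line (the sample words are hypothetical, not from the original file.txt). Both should select the same lines:

```shell
# Hypothetical sample data, one word per line.
words=$(printf '%s\n' boat Boot bat bit about bolt)

# Approach 1: three filters chained with pipes.
piped=$(printf '%s\n' "$words" | grep '^[Bb]' | grep 't$' | grep 'o')

# Approach 2: a single anchored regex.
single=$(printf '%s\n' "$words" | grep '^[Bb].*o.*t$')

printf '%s\n' "$piped"
printf '%s\n' "$single"
```

Single quotes are used here so the shell leaves $ and * alone before grep sees them.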
Why is it that when I use this regex with grep on a file,
grep  ^$1.*$2.*$3$ $4

it only works if there is one word per line?
How it is entered on the command line - Unix> ./find_words.sh M a y userD.txt

Also, when using a variable, how do I do the equivalent of ^[Bb]? Would ^[*$1] work?
^ - beginning of line
$ - end of line

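So your regex will only find lines that begin with (using your original example) B or b, contain an o, and end with t. The anchors apply to the whole line rather than to a word, and .* matches any characters in between, which is why it only behaves as expected with one word per line.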
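A small check of how ^ and $ behave on multi-word lines, assuming GNU grep. The two sample lines are borrowed from the test file shown later in the thread; the variable names are made up for illustration:

```shell
pattern='^F.*e.*d$'

# Counts matching lines. This whole line starts with F and ends with d.
multi=$(printf '%s\n' 'Fred that Foed' | grep -c "$pattern" || true)

# This line contains the word "Fred" but ends with "this", so $ fails.
other=$(printf '%s\n' 'Fred this' | grep -c "$pattern" || true)

printf 'Fred that Foed: %s\n' "$multi"
printf 'Fred this: %s\n' "$other"
```

The `|| true` is only there because grep exits with status 1 when nothing matches.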

Try using this Perl regex (-P):
grep -P \\b\(?i\)$1.*$2.*$3\\b $4

(?i) - turn on case-insensitive matching
\b - word boundary

The backslashes and parentheses have special meaning in bash, so they need to be escaped to reach grep literally.
I tried word boundary earlier and cannot get it to work. I am on a VM running Lubuntu. Would that matter? Here is what happens -

Entered on the command line - ./find_words.sh F e d test.txt

Script -
#!/bin/bash
# find_words.sh

# Test for 4 command-line arguments
if [ $# -ne 4 ]; then
        # Print out error message and exit
        echo "usage: Need 4 arguments"
        exit 1
fi

# Earlier attempts, left commented out:
#       grep -P \\b\(?i\)$1.*$2.*$3\\b $4  test.txt | while read -r line ; do
#       echo $line
#done
#       grep  ^$1.*$2.*$3$ $4

        grep -P \\b\(?i\)"$1".*"$2".*"$3"\\b "$4"

test.txt -
Fred that Foed
Fred this
Mary

output -
Fred that Foed
Fred this

I just don't understand why word boundary isn't working.

retroCheck wrote:
I am on vm using lubuntu. Would that matter?
Don't think so. I'm running Debian Sid.

Are you expecting only the words that match to print out? If so, grep won't work.
From the grep man page:
grep, egrep, fgrep, rgrep - print lines matching a pattern


The man page says that -o should allow me to, but it still doesn't work.

-o
--only-matching
Show only the part of a matching line that matches PATTERN.
I couldn't get -o to work either!

Long way from C++, but how about a simple Perl script:
#!/usr/bin/perl
use strict;
use warnings;

if (@ARGV != 4) {
    die "Requires 4 arguments\n";
}

my ($begin, $letter, $end, $file) = @ARGV;

open(my $in, '<', $file)
    or die "Couldn't read from $file: $!\n";

while (my $line = <$in>) {
    # Braces around the names keep [a-z] a character class, not an array subscript.
    my @matches = ($line =~ /\b${begin}[a-z]*${letter}[a-z]*${end}\b/gi);
    for my $word (@matches) {
        print $word, "\n";
    }
}

close $in;
> The backslashes and parentheses have special meaning in bash
> so they need to be escaped for literal interpretation
you may also quote them
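A minimal sketch of that point, with the letters F, e, d hard-coded in place of $1, $2, $3 for illustration. It only compares the strings the shell would hand to grep; it does not run grep itself:

```shell
# Pattern written with backslash escapes, as in the script above.
# Note: the unquoted ? and * are still glob characters, so this form
# could be expanded by the shell if a matching file name existed.
escaped=$(printf '%s' \\b\(?i\)F.*e.*d\\b)

# Same pattern inside single quotes: no escaping needed, no globbing.
quoted=$(printf '%s' '\b(?i)F.*e.*d\b')

printf '%s\n' "$escaped"
printf '%s\n' "$quoted"
```

Both lines should print the same text, which is why quoting is usually the easier form to read.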


> I couldn't get -o to work either!
The period . matches any single character, and it matches spaces too, so .* can stretch across several words.

In your Perl script you changed it to [a-z].
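To illustrate, here is a sketch assuming GNU grep (where \b and \w are available without -P) and using the first line of the test file from earlier in the thread. With .* the -o output is the whole span, while \w* keeps each match inside one word:

```shell
line='Fred that Foed'

# .* also matches spaces, so the single match runs from the first F
# to the last d on the line.
greedy=$(printf '%s\n' "$line" | grep -o -i '\bF.*e.*d\b')

# \w* cannot cross a space, so each word is matched on its own.
perword=$(printf '%s\n' "$line" | grep -o -i '\bF\w*e\w*d\b')

printf 'with .*:\n%s\n' "$greedy"
printf 'with \\w*:\n%s\n' "$perword"
```

So -o was doing its job earlier; the match itself just happened to cover the entire line.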


> Long way from C++
oh yeah, this is a C++ forum
# echo "foo bar gaz bur bum bear" | grep -o "\bb[^ ]*a[^ ]*r\b"
bar
bear
# rpm -q grep
grep-2.6.3-6.el6.x86_64

(The newline between matched words is semi-surprising.)

The "\bb[^ ]*a[^ ]*r\b" contains:
\b word boundary
b literal 'b'
[^ ]* 0 or more non-space
a literal 'a'
[^ ]* 0 or more non-space
r literal 'r'
\b word boundary

Apparently the el6 grep takes the \b without the perl-flag.
Positional parameter substitution and escape characters with BASH script ... you already had the hang of them.
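One difference between [^ ]* and \w* that may matter, shown as a sketch assuming GNU grep and a made-up input string: [^ ]* accepts any non-space character, including punctuation, while \w* accepts only letters, digits, and underscore.

```shell
# Hypothetical input containing a hyphenated token.
input='bar ba-r'

# [^ ]* allows the hyphen, so "ba-r" is reported as a match too.
nonspace=$(printf '%s\n' "$input" | grep -o '\bb[^ ]*a[^ ]*r\b')

# \w* stops at the hyphen, so only "bar" is reported.
wordchars=$(printf '%s\n' "$input" | grep -o '\bb\w*a\w*r\b')

printf '%s\n' "$nonspace"
printf '%s\n' "$wordchars"
```

Which one is right depends on whether tokens such as "ba-r" should count as words.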
Got it to work. It seems much of the issue was with quotes.

grep -o '\b'$1'\w*'$2'\w*'$3'\b' $4

Of course keskiverto's example works too -
grep -o "\b"$1"[^ ]*"$2"[^ ]*"$3"\b" $4

Thanks!!! I was struggling with this for a while.
¿why do you leave your variables outside the quotes?
grep -o "\b$1[^ ]*$2[^ ]*$3\b" "$4"
He was using both single and double quotes. Single quotes prevent variable substitution; double quotes do not. It can be tricky to notice all the possibilities.
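A short sketch of the difference, using `set --` to stand in for the script's arguments (F, e, d are the values from the earlier test run; the variable names are made up):

```shell
# Simulate the script's positional parameters.
set -- F e d

# Single quotes: $1, $2, $3 are passed through as literal text.
single='\b$1\w*$2\w*$3\b'

# Double quotes: the parameters are substituted, while \b and \w are
# left untouched because a backslash before b or w is not special here.
double="\b$1\w*$2\w*$3\b"

printf '%s\n' "$single"
printf '%s\n' "$double"
```

Keeping the whole pattern inside one pair of double quotes also protects the substituted values from word splitting and globbing, which the mixed '\b'$1'\w*' form does not.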