[Solved] Add an exclude array to an existing awk code


EDIT: The OP pointed out that there can also be words like "a" (a quote on both sides of the word), so the following version handles that case too.

awk '
BEGIN{
  s1="\""
  num=split("McCartney feat. vs. CD USA NYC",array," ")
  for(k=1;k<=num;k++){
     temp=tolower(array[k])
     ignoreLetters[temp]=array[k]
  }
  num=split("a the to at in on with and but or",array," ")
  for(i=1;i<=num;i++){
    smallLetters[array[i]]=array[i]
  }
}
/TITLE/{
  for(i=2;i<=NF;i++){
    front=end=nothing=both=""
    if($i~/^"/ && $i!~/"$/){
      temp=tolower(substr($i,2))
      front=1
    }
    else if($i ~ /^".*"$/){
      temp=tolower(substr($i,2,length($i)-2))
      both=1
    }
    else if($i ~/"$/ && $i!~/^"/){
      temp=tolower(substr($i,1,length($i)-1))
      end=1
    }
    else{
      temp=tolower($i)
      nothing=1
    }
    if(temp in ignoreLetters){
      if(front){
         $i=s1 ignoreLetters[temp]
      }
      else if(end){
         $i=ignoreLetters[temp] s1
      }
      else if(both){
         $i=s1 ignoreLetters[temp] s1
      }
      else if(nothing){
         $i=ignoreLetters[temp]
      }
    }
    else if(temp in smallLetters){
      if(front){
         $i=s1 smallLetters[temp]
      }
      else if(end){
         $i=smallLetters[temp] s1
      }
      else if(nothing){
         $i=smallLetters[temp]
      }
      else if(both){
         $i=s1 smallLetters[temp] s1
      }
    }
    else{
      if($i~/^\"/){
        $i=substr($i,1,1) toupper(substr($i,2,1)) substr($i,3)
      }
      else{
        $i=toupper(substr($i,1,1)) substr($i,2)
      }
    }
  }
}
1
'  Input_file


Original answer: could you please try the following.

awk '
BEGIN{
  s1="\""
  num=split("McCartney feat. vs. CD USA NYC",array," ")
  for(k=1;k<=num;k++){
     temp=tolower(array[k])
     ignoreLetters[temp]=array[k]
  }
  num=split("a the to at in on with and but or",array," ")
  for(i=1;i<=num;i++){
    smallLetters[array[i]]=array[i]
  }
}
/TITLE/{
  for(i=2;i<=NF;i++){
    front=end=nothing=""
    if($i~/^"/){
      temp=tolower(substr($i,2))
      front=1
    }
    else if($i ~/"$/){
      temp=tolower(substr($i,1,length($i)-1))
      end=1
    }
    else{
      temp=tolower($i)
      nothing=1
    }
    if(temp in ignoreLetters){
      if(front){
         $i=s1 ignoreLetters[temp]
      }
      else if(end){
         $i=ignoreLetters[temp] s1
      }
      else if(nothing){
         $i=ignoreLetters[temp]
      }
    }
    else if(tolower($i) in smallLetters){
      $i=tolower(substr($i,1,1)) substr($i,2)
    }
    else{
      if($i~/^\"/){
        $i=substr($i,1,1) toupper(substr($i,2,1)) substr($i,3)
      }
      else{
        $i=toupper(substr($i,1,1)) substr($i,2)
      }
    }
  }
}
1
'  Input_file

The output will be as follows:

FILE "Two The Beatles Songs.wav" WAVE
  TRACK 01 AUDIO
TITLE "Dig a Pony, feat. Paul McCartney"
    PERFORMER "The Beatles"
    INDEX 01 00:00:00
  TRACK 02 AUDIO
TITLE "From Me to You"
    PERFORMER "The Beatles"
    INDEX 01 03:58:02

What the code takes care of:

  • It lowercases the listed small words (a, the, to, at, in, on, with, and, but, or).
  • It writes the listed exception words (McCartney, feat., vs., CD, USA, NYC) exactly as spelled, as the OP asked in the question.
  • It capitalizes the first letter of every remaining field that does NOT fall into either of the above categories.
  • It also handles words that start or end with ": the quote is removed first to check whether the word is present in one of the user-supplied arrays, and is then added back according to its position.
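The strip, look up, and re-add steps listed above can also be written more compactly. This is only a sketch, not a replacement for the code above: the function name titlecase, the single merged array keep, and the variables pre and post are hypothetical names introduced here for illustration. It assumes the same two word lists and the same /TITLE/ line format as above.

```shell
# Hypothetical helper: reads a cue sheet on stdin, prints the fixed version.
titlecase() {
  awk '
  BEGIN {
    q = "\""
    # Exception words: looked up in lowercase, written back as spelled here.
    n = split("McCartney feat. vs. CD USA NYC", a, " ")
    for (k = 1; k <= n; k++) keep[tolower(a[k])] = a[k]
    # Small words: stored and written back in lowercase.
    n = split("a the to at in on with and but or", a, " ")
    for (k = 1; k <= n; k++) keep[a[k]] = a[k]
  }
  /TITLE/ {
    for (i = 2; i <= NF; i++) {
      w = $i; pre = post = ""
      # Remember and strip a leading and/or trailing quote.
      if (w ~ /^"/) { pre = q; w = substr(w, 2) }
      if (w ~ /"$/) { post = q; w = substr(w, 1, length(w) - 1) }
      key = tolower(w)
      if (key in keep) w = keep[key]                    # listed word
      else w = toupper(substr(w, 1, 1)) substr(w, 2)    # capitalize the rest
      $i = pre w post                                   # put the quotes back
    }
  }
  1'
}
```

It could be called as titlecase < Input_file. As with the code above, assigning to $i makes awk rebuild the line, so leading spaces on TITLE lines are dropped.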
