I’m trying to translate a strange derivative of XML into valid XML. Here
is an example line:
<SUBEVENTSTATUS 1:2>gofaststoppednameval</SUBEVENTSTATUS 1:1><SUBEVENTSTATUS 2:2><…and on
REXML pukes on the <SUBEVENTSTATUS 1:2> tag… which it should. There
should be some kind of attribute declaration instead. I want to
translate it to something like this:
I’m trying to make a regex to detect the funny tags. Here is what I have
so far:
xml_fix=/<(\S+)\s+(\d+):(\d+)>/
This is great, but it will match this:
<code_set_list 1:2>
instead of just this:
<code_set_list 1:2>
…because there is no guaranteed whitespace between tags. Basically, I
need to stop matching if a “>” is found. I’ve never had to deal with
anything quite like this in my regex experience. Any help or thoughts of
a better way to do things is much appreciated!
I can think of several solutions:
/<([^>\s]+)\s+(\d+):(\d+)>/
Or even a two-phase approach: first grab each tag with
/<[^>]+>/
and then check the matched text with
/(\d+):(\d+)>\z/
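To see the difference between the original pattern and the first one above, here is a small sketch. The input string is made up for illustration (two tags with no whitespace between them); it is not taken from the real data.

```ruby
# Two adjacent tags with no whitespace between them (made-up sample input).
line = "<code_set><code_set_list 1:2>"

greedy = /<(\S+)\s+(\d+):(\d+)>/      # original pattern: \S+ also matches ">" and "<"
strict = /<([^>\s]+)\s+(\d+):(\d+)>/  # tag name may contain neither ">" nor whitespace

greedy_match = line.match(greedy)
strict_match = line.match(strict)

puts greedy_match[0]  # starts at the first "<" and swallows both tags
puts strict_match[0]  # only the tag carrying the N:M suffix
puts strict_match[1]  # the tag name on its own
```

The stricter pattern cannot run past the end of a tag, because the name part stops at the first ">".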
HTH
robert
Awesome, and thank you! But for my benefit, could you explain why that
works? I thought ^ was line start?
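For what it's worth, ^ means two different things depending on where it sits: outside square brackets it anchors at the start of a line, while as the first character inside brackets it negates the character class. A minimal sketch with made-up strings:

```ruby
# Outside brackets, ^ anchors the match at the start of a line.
anchor_pos = ("<a>" =~ /^</)          # position of "<" at line start

# Inside brackets, a leading ^ negates the class:
# [^>] reads as "any single character except >".
name = "<code_set>rest"[/<([^>]+)>/, 1]

puts anchor_pos
puts name
```

So [^>\s]+ is "one or more characters that are neither > nor whitespace", which is why the match can no longer cross a tag boundary.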
I’d simply use /<[^>]+\s+(\d+):(\d+)>/ (untested, but you get my
drift)…
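A quick check of that pattern, plus one possible way to do the actual rewrite. This is only a sketch under assumptions: the sample strings are made up, the attribute names "a" and "b" are placeholders (the thread never settles on a target format), and dropping the suffix from closing tags is a guess at what valid output should look like.

```ruby
# Pattern from the reply above, which was marked untested there.
simple = /<[^>]+\s+(\d+):(\d+)>/

sample = "<code_set><code_set_list 1:2>"   # made-up input
m = sample.match(simple)
puts m[0]   # the funny tag only; [^>]+ backtracks to leave room for \s+
puts m[1]   # first number
puts m[2]   # second number

# Hypothetical rewrite: "a" and "b" are placeholder attribute names, and
# stripping the suffix from closing tags is an assumption, not a spec.
rewrite = /<(\/?)([^>\s]+)\s+(\d+):(\d+)>/
fixed = "<SUBEVENTSTATUS 1:2>go</SUBEVENTSTATUS 1:1>".gsub(rewrite) do
  $1.empty? ? %(<#{$2} a="#{$3}" b="#{$4}">) : "</#{$2}>"
end
puts fixed
```

Note that the simpler pattern does not capture the tag name, so the rewrite uses a variant with extra groups for the optional slash and the name.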