unsigned integer underflow #1696

mxyns · 2023-10-09T09:52:32Z

Hello,

While using nom I encountered an issue with this impl of Offset for [u8]:

Lines 599 to 606 in 8c68e22

    
    impl Offset for [u8] {
        fn offset(&self, second: &Self) -> usize {
            let fst = self.as_ptr();
            let snd = second.as_ptr();

            snd as usize - fst as usize
        }
    }

The return value underflows when the first span is after the second in memory. This means the user always has to think about whether to use first.offset(second) or second.offset(first):

fn main() {

    let buf = [0,1,2,3,4,5,6,7,8,9];
    let mut first: LocatedSpan<&[u8]>= LocatedSpan::from(&buf[..]);
    let mut second = some_parser::read(first);

    let offset = first.offset(&second);
    println!("offset is negative: {}", offset);

    let offset = second.offset(&first);
    println!("offset is positive: {}", offset);
}

To me, an offset is something that may be negative. However, changing the return type of the offset method to isize may be problematic elsewhere; I don't know much about the internals of nom.

Maybe taking the absolute value of the result would be enough? Or Offset could provide a signed version of the offset method?
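To make the two suggestions concrete, here is a minimal sketch of what a signed and an absolute-value variant could look like for byte slices. The names signed_offset and abs_offset are hypothetical helpers written for illustration; they are not part of nom's Offset trait, and the sketch assumes both slices come from the same allocation.

```rust
/// Hypothetical helper (not part of nom): signed distance in bytes from the
/// start of `first` to the start of `second`.
fn signed_offset(first: &[u8], second: &[u8]) -> isize {
    let fst = first.as_ptr() as usize;
    let snd = second.as_ptr() as usize;
    // wrapping_sub never panics. Reinterpreting the wrapped result as isize
    // gives the signed difference, as long as it fits in isize, which holds
    // for two slices of the same allocation.
    snd.wrapping_sub(fst) as isize
}

/// Hypothetical helper (not part of nom): distance in bytes between the two
/// start pointers, regardless of argument order.
fn abs_offset(first: &[u8], second: &[u8]) -> usize {
    (first.as_ptr() as usize).abs_diff(second.as_ptr() as usize)
}

fn main() {
    let buf = [0u8, 1, 2, 3, 4, 5, 6, 7, 8, 9];
    let first = &buf[2..];
    let second = &buf[5..];

    assert_eq!(signed_offset(first, second), 3);
    assert_eq!(signed_offset(second, first), -3);
    assert_eq!(abs_offset(second, first), 3);
}
```

Neither variant checks that the two slices are actually related; they only avoid the panic on subtraction.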


danielocfb · 2024-05-03T16:32:14Z

Seeing the same issue. Also, I am not sure if "user error" is the right connotation here. The trait seems pretty broken if you ask me, as it appears to make the assumption that self and second are related, but that is not enforced or even documented anywhere to the degree I can tell. Okay, perhaps one can guess that relationship by what it does. And yet, this functionality is used in higher level APIs where this contract seems lost entirely.
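As an illustration of how the implicit contract could be made explicit, the sketch below returns an Option and only yields an offset when the second slice lies inside the address range of the first. checked_offset is a hypothetical free function, not an existing nom API, and comparing address ranges is only an approximation of "is a subslice of".

```rust
/// Hypothetical helper (not part of nom): returns the offset of `inner`
/// within `outer`, or None if `inner` does not lie inside `outer`'s
/// address range.
fn checked_offset(outer: &[u8], inner: &[u8]) -> Option<usize> {
    let outer_start = outer.as_ptr() as usize;
    let inner_start = inner.as_ptr() as usize;

    // None if `inner` starts before `outer` (the underflow case).
    let off = inner_start.checked_sub(outer_start)?;
    // None if `inner` would extend past the end of `outer`.
    let end = off.checked_add(inner.len())?;
    if end <= outer.len() {
        Some(off)
    } else {
        None
    }
}

fn main() {
    let buf = [0u8, 1, 2, 3, 4, 5, 6, 7, 8, 9];
    let other = [42u8; 4];

    // A real subslice yields its offset.
    assert_eq!(checked_offset(&buf, &buf[4..]), Some(4));
    // Swapped arguments no longer underflow; they are rejected.
    assert_eq!(checked_offset(&buf[4..], &buf), None);
    // A non-empty slice from a different array is rejected as well.
    assert_eq!(checked_offset(&buf, &other), None);
}
```

An empty slice that happens to sit directly at the end of the outer slice would still be accepted, so this is a sanity check rather than a proof of relatedness.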

E.g., this program panics due to over/underflow:

use nom::error::ErrorKind::Tag;
use nom::error::VerboseErrorKind::Nom;
use nom::error::convert_error;
use nom::error::VerboseError;

fn main() {
    let input = [31, 139, 8, 8, 85, 135, 48, 102, 2, 255, 108, 105, 98];
    let err = VerboseError {
        errors: vec![(String::from_utf8_lossy(&input), Nom(Tag))],
    };
    let _x = convert_error(String::from_utf8_lossy(&input), err);
}

The problem isn't obvious at all here. And in a general setting, the fix (on the user's end) may not be obvious either.
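One way to see why the program above breaks the "same slice" assumption: the input contains bytes that are not valid UTF-8 (139, for example), so each String::from_utf8_lossy call has to allocate a fresh String. The sketch below only uses the standard library; the helper names are made up for illustration.

```rust
use std::borrow::Cow;

/// Hypothetical helper: true if the lossy conversion had to allocate.
fn is_owned_after_lossy(bytes: &[u8]) -> bool {
    matches!(String::from_utf8_lossy(bytes), Cow::Owned(_))
}

/// Hypothetical helper: true if two lossy conversions of the same bytes
/// point at the same buffer.
fn lossy_copies_share_buffer(bytes: &[u8]) -> bool {
    let a = String::from_utf8_lossy(bytes);
    let b = String::from_utf8_lossy(bytes);
    a.as_ptr() == b.as_ptr()
}

fn main() {
    let input = [31u8, 139, 8, 8, 85, 135, 48, 102, 2, 255, 108, 105, 98];

    // Invalid UTF-8 forces an allocation, and two calls give two buffers.
    assert!(is_owned_after_lossy(&input));
    assert!(!lossy_copies_share_buffer(&input));

    // Valid UTF-8 is borrowed, so both conversions point into the input.
    assert!(!is_owned_after_lossy(b"lib"));
    assert!(lossy_copies_share_buffer(b"lib"));
}
```

With two unrelated buffers, the pointer subtraction in Offset has no meaningful result: depending on where the allocator places them it either underflows or produces an offset that does not correspond to any position in the input.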

We have gotten reports of a crash caused by numeric overflow when attempting to parse a Breakpad file. The reason lies in the error conversion functionality that nom provides. Specifically, the nom::error::convert_error() function expects errors with a context that derefs to a str slice. In order to fulfill that contract we convert everything to Cow<str>.

However, the function also implicitly assumes that substrings belong to the same input slice. That is not the case, and cannot be easily fixed, because there is no sane way of "relocating" an arbitrary byte slice after a lossy string conversion. The requirement of derefing into a str while *also* mapping to byte slices that are assumed to be subslices of each other just seems plain broken.

This change imports a copy of the convert_error() function and fixes up the broken bits. The upstream issue touching on this problem is: rust-bakery/nom#1696

Signed-off-by: Daniel Müller <[email protected]>

Geal · 2024-05-05T15:28:33Z

@mxyns @danielocfb as you saw, Offset assumes that the argument is part of the same slice, and that should be documented. I looked at the change in blazesym: why would the lossy conversion step be needed in the first place? Because convert_error was not available for &[u8] inputs?

danielocfb · 2024-05-06T16:15:24Z

> I looked at the change in blazesym: why would the lossy conversion step be needed in the first place? Because convert_error was not available for &[u8] inputs?

Yes, that is the reason. The function assumes something that derefs into str, so we need some kind of conversion first.

Btw., I accidentally commented on what is not the most fitting issue; I think #1619 actually covers this very problem already.

danielocfb mentioned this issue May 3, 2024

Fix potential panic when reporting nom errors libbpf/blazesym#658

Merged
