Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parser error messages #208

Closed
4 tasks done
rachitnigam opened this issue Sep 18, 2020 · 10 comments · Fixed by #301
Closed
4 tasks done

Parser error messages #208

rachitnigam opened this issue Sep 18, 2020 · 10 comments · Fixed by #301
Labels
Type: Paper cut Spurious or confusing errors

Comments

@rachitnigam
Copy link
Contributor

rachitnigam commented Sep 18, 2020

The parser gives bad error messages when:

  • When a cell is missing a semicolon
  • When numbers without bitwidths are used.
  • When while bodies don't have seq inside them.
  • When you have a non-binary number in a binary literal like in 32'b4
@rachitnigam rachitnigam added the Type: Paper cut Spurious or confusing errors label Sep 18, 2020
@rachitnigam
Copy link
Contributor Author

@sgpthomas @tedbauer add issues here as you find them

@cgyurgyik
Copy link
Collaborator

So I’ve done some digging up on pest. A common consensus from its users is “the error messages suck.” This seems to be a problem in FuTIL as well. I've raised an issue asking about getting a nifty error for binary numbers without 0 or 1 ( pest-parser/pest#476 ); still waiting on a response. It seems the pest community is pretty dead lately.

I also tried looking at an even simpler example similar to one listed here: import statement missing a semicolon.

For example let's look at the syntax for import:

import = _{
      "import" ~ string_lit ~ ";"
}

If the user forgets a semicolon here, it will give the following error:

Error: FuTIL Parser:  --> 1:27
  |
1 | import "primitives/std.lib"␊
  |                           ^---
  |
  = expected char

Ideally, we would want to do something along the lines of expected semicolon or expected ;. I tried simply adding a new syntax, in the following manner:

semicolon = { ";" } // If this is silent, it simply returns the bad error message.
import = { "import" ~ string_lit ~ semicolon }

There's a few problems I'm running into with this.

  1. This will now return an error of expected semicolon, but on the next available (non-whitespace) line! I'm not sure why. Example:
Error: FuTIL Parser:  --> 7:1
  |
7 | component main() -> () {␊
  | ^---
  |
  = expected semicolon

  1. If I now add a semicolon, I get an error since I need to have a corresponding parsing method:
thread 'main' panicked at 'Rule `semicolon` does not have a corresponding parsing method', calyx/src/frontend/syntax.rs:83:1
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

However, I'm not sure what this method should do. I'd like it to just consume the semicolon and be done with it. I thought this would be simply as returning Ok(()), but this causes the entire import statement to be mis-read:

// In syntax.rs
fn semicolon(input: Node) -> ParseResult<()> {
    Ok(())
}

Resulting error:

Error: FuTIL Parser:  --> 1:1
  |
1 | import "primitives/std.lib";␊
  | ^--------------------------^
  |
  = Nodes didn't match any pattern: [string_lit, semicolon]


One hacky solution I was thinking about is simply allowing a binary number to take any digit, and catching the error later down the stack when we actually try to parse the binary number. I'm not exactly sure if this would be possible / how easy this would be, but its something I could try with your guys consent.
It looks like https://github.com/cucapra/futil/blob/f60942c74859cf98b285142e3023648398cf3c32/calyx/src/frontend/syntax.rs#L123 is where the error would pop up. It would return a ParseIntError according to the Rust documentation, so from there we'd need to convert it to an Error rule, at least according to how that function works. That seems like a code smell, so maybe there's a better way to do this.

@sampsyo
Copy link
Contributor

sampsyo commented Oct 4, 2020

Hey, thanks for looking into this! In general, getting good error messages out of a parser generator is quite hard—even when the parser generator makes good error reporting a priority. If the library doesn't support special features to offer control over error messages, often the only recourse is to just try fiddling around with equivalent ways of writing the same grammar (as you have here) to see if the error messages magically get better. Or to just switch to another parser generator!

Anyway, I think it's worth fiddling around here, but I just wanted to put a warning here that it may be infeasible to make errors better without a more drastic step (in which case maybe it's not worth it for now).

@cgyurgyik
Copy link
Collaborator

Thanks Adrian. I agree. Part of the next release of pest (v3) was allowing for error messages to be part of the parse generation stage. This is specifically discussed in pest-parser/pest#352 which links to pest-parser/pest#333 , a discussion on v3. The last traffic in that issue was in May, and it seems progression has slowed down to a standstill. Like I mentioned above, I could try and put together a hacky solution, but I'm not sure if its worth the code maintenance degradation.

I am happy to hear other solutions though, since I'm still quite new to Rust and pest. Or, if switching to another parse generator is something of interest, I could look into that as well.

@sgpthomas
Copy link
Collaborator

yeah, it might be worthwhile to switch to a different parser. as much of a pain as that would be. A parser combinator library may be the way to go for this because they are generally better for error messages. I'm familiar with nom and I like it but it may be worthwhile to look into some other options. A thing that I would like to retain is being able to continue carrying span information around with identifiers so that we can have location specific error messages further along in the compiler pipeline.

I should add that you shouldn't feel obligated to rewrite the parser. It's a rather tedious task, although it might give you some more rust experience.

@cgyurgyik
Copy link
Collaborator

Ok we can bring this up in the next meeting. I wouldn't be opposed to the idea. Like you've mentioned, it would be a good entryway into learning Rust.

@rachitnigam
Copy link
Contributor Author

@cgyurgyik Thanks for all the discussion! While better parsing error message are going to be great, it seems like switching out the parser will require some planning. In the meantime, if you want to jump into the deep end and quickly learn Rust and our pass infrastructure, I instead recommend writing a pass. For example, #177 would be good, self-contained pass that's worth implementing.

@cgyurgyik
Copy link
Collaborator

Ok yeah I wasn't sure where it was on the list of priorities, and I'm sure its a pretty big undertaking. I'll look into #177.

@rachitnigam
Copy link
Contributor Author

The general strategy to fixing these issues was adding additional parse rules that parse the bad case and then throw an informative error message.

@sampsyo
Copy link
Contributor

sampsyo commented Dec 9, 2020

Awesome.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Paper cut Spurious or confusing errors
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants