Parsing the content of a cartouche using a “term parser”


I want to implement a domain-specific language (with its own parser) inside cartouches in Isabelle. For example, I would like the term (MY ‹123›, 3) to invoke my own parser for the substring 123, but to parse the rest normally as a term.
Following HOL/ex/Cartouche_Examples.thy, I understand how to install my own parse translation for subterms of the form MY ‹...›, and how to get the content of the cartouche as either string*Position.T or as Symbol_Pos.T list.
I also understand how to use Isabelle's parser combinators to write a parser of type term parser.
But I cannot find out how to apply the parser to a string (or a Symbol_Pos.T list).
In other words, what I am still lacking is a function
fun parse_cartouche ctx (cartouche:string) (pos:Position.T) : term = ???
that applies my parser of type term parser to the string cartouche (and correctly reports parse errors to the top level).
To clarify:
- I want to make use of Isabelle's existing infrastructure for tracking and reporting parsing locations. For example, if there is a parse error, I expect the code to be marked red in Isabelle/jEdit, and if I call a parser like Args.parse_term inside my own language, I expect Isabelle/jEdit to color variables correctly and to show type information on control-hover.
- I prefer not to reimplement my own parsers for common things like ints etc., but I can do so if I get at least the previous bullet point. (However, to parse a substring of my language as a term, I would have to use some existing parsing function, since I cannot reimplement the Isabelle syntax on my own.)
Below is my complete code so far (with a dummy implementation of parse_cartouche).
theory Scratch
  imports Main
begin

ML {*
  (* In reality, this would of course be a much more complex parser. *)
  val my_parser : term parser = Parse.nat >> HOLogic.mk_nat

  (* This function should invoke my_parser to parse the content of cartouche.
     Parse errors should be properly reported (i.e., with red highlighting in
     jEdit etc.). *)
  fun parse_cartouche ctx (cartouche:string) (pos:Position.T) : term =
    (warning ("I should parse: " ^ cartouche ^ ". Returning arbitrary term instead"); @{term True})

  (* Modified from Cartouche_Examples.thy *)
  fun cartouche_tr (ctx:Proof.context) args =
    let fun err () = raise TERM ("cartouche_tr", args) in
      (case args of
        [(c as Const (@{syntax_const "_constrain"}, _)) $ Free (s, _) $ p] =>
          (case Term_Position.decode_position p of
            SOME (pos, _) => c $ (parse_cartouche ctx s pos) $ p
          | NONE => err ())
      | _ => err ())
    end;
*}

syntax "_my_syntax" :: "cartouche_position ⇒ 'a" ("MY_")
parse_translation ‹[(@{syntax_const "_my_syntax"}, cartouche_tr)]›

term "(MY ‹123›, 3)" (* Should parse as (123,3) *)

end
Because this is a relatively rare use case, I'm not sure if a "canonical" solution for this has emerged yet. But I can at least give you two examples from my own code which should help illustrate the general approach.
Evaluation of ML code in terms
The following parse translation assumes a function eval_term that turns ML source into a term (in the code below it is applied to an Input.source and the context). It extracts some ML source from a cartouche, evaluates it to a term, and uses that term as the result of the parse translation.
fun term_translation ctxt args =
  let
    fun err () = raise TERM ("Splice.term_translation", args)
    fun input s pos =
      let
        val content = Symbol_Pos.cartouche_content (Symbol_Pos.explode (s, pos))
        val (text, range) = Symbol_Pos.implode_range (Symbol_Pos.range content) content
      in
        Input.source true text range
      end
  in
    case args of
      [(c as Const (@{syntax_const "_constrain"}, _)) $ Free (s, _) $ p] =>
        (case Term_Position.decode_position p of
          SOME (pos, _) => c $ eval_term (input s pos) ctxt $ p
        | NONE => err ())
    | _ => err ()
  end
Embedding XML
This one allows me to embed XML literals into terms which will then be interpreted as terms.
syntax "_cartouche_xml" :: "cartouche_position \<Rightarrow> 'a" ("XML _")
parse_translation\<open>
  let
    fun translation args =
      let
        fun err () = raise TERM ("Common._cartouche_xml", args)
        fun input s pos = Symbol_Pos.implode (Symbol_Pos.cartouche_content (Symbol_Pos.explode (s, pos)))
        val eval = Codec.the_decode Codec.term o XML.parse
      in
        case args of
          [(c as Const (@{syntax_const "_constrain"}, _)) $ Free (s, _) $ p] =>
            (case Term_Position.decode_position p of
              SOME (pos, _) => c $ eval (input s pos) $ p
            | NONE => err ())
        | _ => err ()
      end
  in
    [(@{syntax_const "_cartouche_xml"}, K translation)]
  end
\<close>
Update
The following code should allow you to turn an Input.source into something digestible for the parser combinators, including full position information:
ML ‹
  val input = ‹term"3 + 4"›;
  (* a bit more complicated than just Input.pos_of because otherwise the position includes the
     outer cartouche brackets, which manifests as an off-by-one-error in the markup *)
  val pos = Input.source_explode input |> Symbol_Pos.range |> Position.range_position;
  val str = Input.source_content input;
  val toks = Token.explode Keyword.empty_keywords pos str;
  val parser = Args.$$$ "term" |-- Args.embedded_inner_syntax;
  parser toks |> fst |> Syntax.read_term @{context}
›
Based on @larsrh's answer and my own experimentation, I came up with the following answer.
The parse translation gets the cartouche content as a string, together with a position. These can be converted into a Symbol_Pos.T list using Symbol_Pos.cartouche_content o Symbol_Pos.explode. (This is covered in the examples in Cartouche_Examples.thy, distributed with Isabelle.)
The Symbol_Pos.T list can be converted into a Source.source containing Symbol_Pos.Ts using Source.of_list.
The Source.source containing Symbol_Pos.Ts can be converted into a Source.source containing Token.Ts using Token.source'.
We remove whitespace tokens from this source using Token.source_proper.
The result is converted to a Token.T list using Source.exhaust.
Finally, parsers of type 'a parser can be applied to such a Token.T list. (Or, if we have an 'a context_parser, then we additionally need to supply a context.)
Some additional work needs to be done: an EOF token has to be appended to the Token.T list so that parsers can detect the end of the input, and errors in the parser have to be handled (to get nice error messages).
The code below is a complete, commented, working example (for Isabelle 2016-1).
theory Scratch
  imports Main
begin

ML {*
  (* test_parser is just a definition of a silly example parser. It parses text of the form "123 * ‹x+y›"
     where 123 is an arbitrary natural, and x+y a term. test_parser is of type term context_parser.
     The parser returns a term that is a list of 123 copies of x+y.
     If you have constructed a "term parser" instead, you can either convert it using Scan.lift,
     or modify the definition of parse_cartouche below slightly.
  *)
  fun sym_parser sym = Parse.sym_ident :-- (fn s => if s=sym then Scan.succeed () else Scan.fail) >> #1;
  val test_parser = Scan.lift Parse.nat --| Scan.lift (sym_parser "*" || Parse.reserved "x") -- Args.term
                    >> (fn (n,t) => replicate n t |> HOLogic.mk_list dummyT)

  (* parse_cartouche: This function takes the cartouche that should be parsed (as a plain string
     without markup), together with its position. (All this information can be extracted using the
     information available to a parse translation, see cartouche_tr below.) *)
  fun parse_cartouche ctx (cartouche:string) (pos:Position.T) : term =
    let
      (* This extracts the content of the cartouche as a "Symbol_Pos.T list".
         One possibility to continue from here would be to write a parser that works
         on "Symbol_Pos.T list". However, most of the predefined parsers expect
         "Token.T list" (a single token may consist of several symbols, e.g., 123 is one token). *)
      val content = Symbol_Pos.cartouche_content (Symbol_Pos.explode (cartouche, pos))
      (* Translate content into a "Token.T list". *)
      val toks = content |> Source.of_list (* Create a "Source.source" containing the symbols *)
                 |> Token.source' true Keyword.empty_keywords (* Translate into a "Source.source" containing tokens.
                        I don't know what the argument true does here. false also works, I think. *)
                 |> Token.source_proper (* Remove things like whitespaces *)
                 |> Source.exhaust (* Translate the source into a list of tokens *)
                 |> (fn src => src @ [Token.eof]) (* Add an eof to the end of the token list, to enable Parse.eof below *)
      (* A conversion function that produces error messages. The ignored argument here
         contains the context and the list of remaining tokens, if needed for constructing
         the message. *)
      fun errmsg (_,SOME msg) = msg
        | errmsg (_,NONE) = fn _ => "Syntax error"
      (* Apply the parser "test_parser". We additionally combine it with Parse.eof to ensure that
         the parser parses the whole text (till EOF). And we use Scan.!! to convert parsing failures
         into parsing errors, and Scan.error to report parsing errors to the toplevel. *)
      val (term,_) = Scan.error (Scan.!! errmsg (test_parser --| Scan.lift Parse.eof)) (Context.Proof ctx,toks)
      (* If test_parser was of type "term parser" instead of "term context_parser", we would use instead:
         val (term,_) = Scan.error (Scan.!! errmsg (test_parser --| Parse.eof)) toks *)
    in term end

  (* A parse translation that translates cartouches using test_parser. The code is very close to
     the examples from Cartouche_Examples.thy. It takes a given cartouche-subterm, gets its
     position, and calls parse_cartouche to do the translation to a term. *)
  fun cartouche_tr (ctx:Proof.context) args =
    let fun err () = raise TERM ("cartouche_tr", args) in
      (case args of
        [(c as Const (@{syntax_const "_constrain"}, _)) $ Free (s, _) $ p] =>
          (case Term_Position.decode_position p of
            SOME (pos, _) => c $ (parse_cartouche ctx s pos) $ p
          | NONE => err ())
      | _ => err ())
    end;
*}

(* Define a syntax for calling our translation. In this case, the syntax is "MY ‹to-be-parsed›" *)
syntax "_my_syntax" :: "cartouche_position ⇒ 'a" ("MY_")

(* Binds our parse translation to that syntax. *)
parse_translation ‹[(@{syntax_const "_my_syntax"}, cartouche_tr)]›

term "(MY ‹3 * ‹b+c››, 2)" (* Should parse as ([b+c,b+c,b+c],2) *)
term "(MY ‹10 x ‹q››, 2)" (* Should parse as ([q, q, q, q, q, q, q, q, q, q], 2) *)
term "(MY ‹3 * ‹MY ‹3 * ‹b+c››››, 2)" (* Things can be nested! *)

end
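For the plain parser from the question (my_parser, of type term parser rather than term context_parser), the variant hinted at in the comment inside parse_cartouche might look as follows. This is an untested sketch that only reuses calls already appearing in the example above; the name parse_cartouche_plain is made up for illustration, and the block is meant to sit inside the theory, before the final end.

```isabelle
ML {*
  (* The example parser from the question: a natural number literal. *)
  val my_parser : term parser = Parse.nat >> HOLogic.mk_nat

  (* Hypothetical name. Same pipeline as parse_cartouche above, but for a
     "term parser": no context is threaded through, so Scan.lift and
     Context.Proof are not needed. *)
  fun parse_cartouche_plain (cartouche:string) (pos:Position.T) : term =
    let
      (* Cartouche content as symbols with positions *)
      val content = Symbol_Pos.cartouche_content (Symbol_Pos.explode (cartouche, pos))
      (* Symbols -> tokens, dropping whitespace and appending EOF *)
      val toks = content |> Source.of_list
                 |> Token.source' true Keyword.empty_keywords
                 |> Token.source_proper
                 |> Source.exhaust
                 |> (fn src => src @ [Token.eof])
      (* Use the parser's own message if there is one *)
      fun errmsg (_,SOME msg) = msg
        | errmsg (_,NONE) = fn _ => "Syntax error"
      (* Require that the whole input is consumed, and report errors to the toplevel *)
      val (term,_) = Scan.error (Scan.!! errmsg (my_parser --| Parse.eof)) toks
    in term end
*}
```

To try it, cartouche_tr would call parse_cartouche_plain s pos in place of parse_cartouche ctx s pos; the ctx argument of cartouche_tr then goes unused.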
