Parsing the content of a cartouche using a “term parser”


I want to implement a domain specific language (with its own parser) inside cartouches in Isabelle. For example, I would like the term (MY ‹123›, 3) to invoke my own parser for the substring 123, but to parse the rest normally as terms.
Following HOL/ex/Cartouche_Examples.thy, I understand how to install my own parse translation for subterms of the form MY ‹...›, and how to get the content of the cartouche either as a string together with a Position.T or as a Symbol_Pos.T list.
I also understand how to use Isabelle's parser combinators to write a parser of type term parser.
But I cannot find out how to apply the parser to a string (or a Symbol_Pos.T list).
In other words, what I am still lacking is a function
fun parse_cartouche ctx (cartouche:string) (pos:Position.T) : term = ???
that applies my parser of type term parser to the string cartouche (and correctly reports parse errors to the top level).
To clarify:
I want to make use of Isabelle's existing infrastructure for tracking and reporting parsing locations. For example, if there is a parse error, I expect the offending code to be highlighted in red in Isabelle/jEdit, and if, inside my own language, I call a parser like Args.parse_term, I expect Isabelle/jEdit to color variables correctly and to show type information on control-hover.
I prefer not to reimplement my own parsers for common things like ints etc., but I can do so if I get at least the previous point. (However, for parsing a substring of my language as a term, I would have to use some existing parsing function, since I cannot reimplement the Isabelle syntax on my own.)
Below is my complete code so far (with a dummy implementation of parse_cartouche).
theory Scratch
imports Main
begin
ML {*
  (* In reality, this would of course be a much more complex parser. *)
  val my_parser : term parser = Parse.nat >> HOLogic.mk_nat

  (* This function should invoke my_parser to parse the content of the cartouche.
     Parse errors should be properly reported (i.e., with red highlighting in
     jEdit etc.). *)
  fun parse_cartouche ctx (cartouche:string) (pos:Position.T) : term =
    (warning ("I should parse: " ^ cartouche ^ ". Returning arbitrary term instead");
     @{term True})

  (* Modified from Cartouche_Examples.thy *)
  fun cartouche_tr (ctx:Proof.context) args =
    let fun err () = raise TERM ("cartouche_tr", args) in
      (case args of
        [(c as Const (@{syntax_const "_constrain"}, _)) $ Free (s, _) $ p] =>
          (case Term_Position.decode_position p of
            SOME (pos, _) => c $ (parse_cartouche ctx s pos) $ p
          | NONE => err ())
      | _ => err ())
    end;
*}
syntax "_my_syntax" :: "cartouche_position ⇒ 'a" ("MY_")
parse_translation ‹[(@{syntax_const "_my_syntax"}, cartouche_tr)]›
term "(MY ‹123›, 3)" (* Should parse as (123,3) *)
end
Because this is a relatively rare use case, I'm not sure if a "canonical" solution for this has emerged yet. But I can at least give you two examples from my own code which should help illustrate the general approach.
Evaluation of ML code in terms
The following parse translation, given a function eval_term, extracts some ML source from a cartouche (as an Input.source), evaluates it to a term, and uses that term as the result of the parse translation.
fun term_translation ctxt args =
  let
    fun err () = raise TERM ("Splice.term_translation", args)
    fun input s pos =
      let
        val content = Symbol_Pos.cartouche_content (Symbol_Pos.explode (s, pos))
        val (text, range) = Symbol_Pos.implode_range (Symbol_Pos.range content) content
      in
        Input.source true text range
      end
  in
    case args of
      [(c as Const (@{syntax_const "_constrain"}, _)) $ Free (s, _) $ p] =>
        (case Term_Position.decode_position p of
          SOME (pos, _) => c $ eval_term (input s pos) ctxt $ p
        | NONE => err ())
    | _ => err ()
  end
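For reference, this translation can be hooked up in the same way as the question's MY syntax and the XML example below. A minimal sketch, where the syntax name and mixfix are made up for illustration, and term_translation (together with its eval_term) is assumed to be defined in an enclosing ML block:

syntax "_cartouche_ml" :: "cartouche_position ⇒ 'a" ("SPLICE _")  (* hypothetical syntax entry *)
parse_translation ‹[(@{syntax_const "_cartouche_ml"}, term_translation)]›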
Embedding XML
This one allows me to embed XML literals into terms; the embedded XML is then decoded into a term.
syntax "_cartouche_xml" :: "cartouche_position \<Rightarrow> 'a" ("XML _")
parse_translation\<open>
let
fun translation args =
let
fun err () = raise TERM ("Common._cartouche_xml", args)
fun input s pos = Symbol_Pos.implode (Symbol_Pos.cartouche_content (Symbol_Pos.explode (s, pos)))
val eval = Codec.the_decode Codec.term o XML.parse
in
case args of
[(c as Const (#{syntax_const "_constrain"}, _)) $ Free (s, _) $ p] =>
(case Term_Position.decode_position p of
SOME (pos, _) => c $ eval (input s pos) $ p
| NONE => err ())
| _ => err ()
end
in
[(#{syntax_const "_cartouche_xml"}, K translation)]
end
\<close>
Update
The following code should allow you to turn an Input.source into something digestible for the parser combinators, including full position information:
ML ‹
val input = ‹term"3 + 4"›;
(* a bit more complicated than just Input.pos_of because otherwise the position includes the
outer cartouche brackets, which manifests as an off-by-one-error in the markup *)
val pos = Input.source_explode input |> Symbol_Pos.range |> Position.range_position;
val str = Input.source_content input;
val toks = Token.explode Keyword.empty_keywords pos str;
val parser = Args.$$$ "term" |-- Args.embedded_inner_syntax;
parser toks |> fst |> Syntax.read_term @{context}
›
Based on @larsrh's answer and my own experimentation, I came up with the following answer.
The parse translation gets the cartouche content as a string, together with a position. These can be converted into a Symbol_Pos.T list using Symbol_Pos.cartouche_content o Symbol_Pos.explode. (This is covered by the examples in Cartouche_Examples.thy, which is distributed with Isabelle.)
The Symbol_Pos.T list can be converted into a Source.source containing Symbol_Pos.Ts using Source.of_list.
That Source.source containing Symbol_Pos.Ts can in turn be converted into a Source.source containing Token.Ts using Token.source'.
We remove whitespace tokens from this source using Token.source_proper.
And the result is converted to a Token.T list using Source.exhaust.
Finally, parsers of type 'a parser can be applied to such a Token.T list. (Or, if we have an 'a context_parser, then we need to additionally supply a context.)
Some additional work needs to be done: add an EOF token to the end of the Token.T list so that parsers can detect the end of the input, and handle errors in the parser (to get nice error messages); see the small sketch below.
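A minimal self-contained sketch of applying a parser to such a token list (this uses Token.explode from the Update above rather than the Source-based pipeline, just to keep it short; the input string "42" and the trivial parser are placeholders only):

ML ‹
  (* Tokenize a string with no keywords, drop whitespace tokens, and terminate with EOF. *)
  val toks =
    Token.explode Keyword.empty_keywords Position.none "42"
    |> filter Token.is_proper
    |> (fn ts => ts @ [Token.eof]);
  (* An 'a parser is applied directly to the token list; a context_parser would
     additionally need a generic context, as in the complete example below. *)
  val (n, _) = (Parse.nat --| Parse.eof) toks;  (* n = 42 *)
›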
The code below is a complete, commented, working example (for Isabelle 2016-1); the source can also be found here.
theory Scratch
imports Main
begin
ML {*
  (* test_parser is just a definition of a silly example parser. It parses text of the form
     "123 * ‹x+y›", where 123 is an arbitrary natural number and x+y is a term. test_parser has
     type "term context_parser". The parser returns a term that is a list of 123 copies of x+y.
     If you have constructed a "term parser" instead, you can either convert it using Scan.lift,
     or modify the definition of parse_cartouche below slightly. *)
  fun sym_parser sym = Parse.sym_ident :-- (fn s => if s = sym then Scan.succeed () else Scan.fail) >> #1;

  val test_parser = Scan.lift Parse.nat --| Scan.lift (sym_parser "*" || Parse.reserved "x") -- Args.term
                    >> (fn (n,t) => replicate n t |> HOLogic.mk_list dummyT)

  (* parse_cartouche: This function takes the cartouche that should be parsed (as a plain string
     without markup), together with its position. (All this information can be extracted using the
     information available to a parse translation, see cartouche_tr below.) *)
  fun parse_cartouche ctx (cartouche:string) (pos:Position.T) : term =
    let
      (* This extracts the content of the cartouche as a "Symbol_Pos.T list".
         One possibility to continue from here would be to write a parser that works
         on "Symbol_Pos.T list". However, most of the predefined parsers expect
         "Token.T list" (a single token may consist of several symbols, e.g., 123 is one token). *)
      val content = Symbol_Pos.cartouche_content (Symbol_Pos.explode (cartouche, pos))
      (* Translate content into a "Token.T list". *)
      val toks = content |> Source.of_list (* Create a "Source.source" containing the symbols *)
                 |> Token.source' true Keyword.empty_keywords (* Translate into a "Source.source" containing tokens.
                      I don't know what the argument true does here. false also works, I think. *)
                 |> Token.source_proper (* Remove things like whitespace *)
                 |> Source.exhaust (* Translate the source into a list of tokens *)
                 |> (fn src => src @ [Token.eof]) (* Add an eof to the end of the token list, to enable Parse.eof below *)
      (* A conversion function that produces error messages. The ignored argument here
         contains the context and the list of remaining tokens, if needed for constructing
         the message. *)
      fun errmsg (_, SOME msg) = msg
        | errmsg (_, NONE) = fn _ => "Syntax error"
      (* Apply the parser "test_parser". We additionally combine it with Parse.eof to ensure that
         the parser parses the whole text (till EOF). And we use Scan.!! to convert parsing failures
         into parsing errors, and Scan.error to report parsing errors to the toplevel. *)
      val (term, _) = Scan.error (Scan.!! errmsg (test_parser --| Scan.lift Parse.eof)) (Context.Proof ctx, toks)
      (* If test_parser was of type "term parser" instead of "term context_parser", we would use instead:
         val (term, _) = Scan.error (Scan.!! errmsg (test_parser --| Parse.eof)) toks *)
    in term end

  (* A parse translation that translates cartouches using test_parser. The code is very close to
     the examples from Cartouche_Examples.thy. It takes a given cartouche subterm, gets its
     position, and calls parse_cartouche to do the translation to a term. *)
  fun cartouche_tr (ctx:Proof.context) args =
    let fun err () = raise TERM ("cartouche_tr", args) in
      (case args of
        [(c as Const (@{syntax_const "_constrain"}, _)) $ Free (s, _) $ p] =>
          (case Term_Position.decode_position p of
            SOME (pos, _) => c $ (parse_cartouche ctx s pos) $ p
          | NONE => err ())
      | _ => err ())
    end;
*}
(* Define a syntax for calling our translation. In this case, the syntax is "MY ‹to-be-parsed›" *)
syntax "_my_syntax" :: "cartouche_position ⇒ 'a" ("MY_")
(* Binds our parse translation to that syntax. *)
parse_translation ‹[(@{syntax_const "_my_syntax"}, cartouche_tr)]›
term "(MY ‹3 * ‹b+c››, 2)" (* Should parse as ([b+c,b+c,b+c],2) *)
term "(MY ‹10 x ‹q››, 2)" (* Should parse as ([q, q, q, q, q, q, q, q, q, q], 2) *)
term "(MY ‹3 * ‹MY ‹3 * ‹b+c››››, 2)" (* Things can be nested! *)
end
