• RDL – (Semi) static type check • contracts.ruby – Only dynamic check of arguments/return values • dry-types – Only dynamic checks of typed structs • RubyTypeInference (by JetBrains) – Type information extractor by dynamic analysis • Sorbet (by Stripe)
– Jeff Foster at Univ. of Maryland – Accepted in OOPSLA, PLDI, and POPL! • The gem is available – https://github.com/plum-umd/rdl • We evaluated RDL – thought writing type annotations for OptCarrot
NES # activate type annotations for RDL extend RDL::Annotate # type annotation before method definition type "(?Array<String>) -> self", typecheck: :call def initialize(conf = ARGV) ...
:call: when this method is called – :now: when this method is defined – :XXX: when "RDL.do_typecheck :XXX" is done – nil: no "static check" is done • Used to type-check code that uses the method • Still "run-time check" is done type "(?Array<String>) -> self", typecheck: :call def initialize(conf = ARGV) ...
instance variables class NES # activate type annotations for RDL extend RDL::Annotate var_type :@cpu, "%any" type "() -> %any", typecheck: :call def reset @cpu.reset #=> receiver type %any not supported yet ...
instance variables class NES # activate type annotations for RDL extend RDL::Annotate var_type :@cpu, "Optcarrot::CPU" type "() -> %any", typecheck: :call def reset @cpu.reset # error: no type information for # instance method `Optcarrot::CPU#reset'
NES # activate type annotations for RDL extend RDL::Annotate type "Optcarrot::CPU","reset","()->%any" var_type :@cpu, "Optcarrot::CPU" type "() -> %any", typecheck: :call def reset @cpu.reset ...
check is done • %bot means "no-return" – Always raises exception, process exit, etc. – But this method returns [65533] – In short, this is my bug in the annotation type "() -> %bot", typecheck: nil def reset ... @rom.load_battery #=> [65533] end
return meaningless value – No intention to allow users to use the value • What type should we use in this case? – %any, or return nil explicitly? • We need a "void" type – %any for the method; it can return anything – "don't use" for users of the method def reset LIBRARY_INTERNAL_ARRAY. each { … } end
can be also used to make type annotation automatically • I like this feature, but matz doesn't – He wants to avoid type annotations embedded in the code – He likes separated, non-Ruby type definition language (as Steep) pre(:belongs_to) do |name| …… type name, "() -> #{klass}" end
configurable • It checks the method body – Not only dynamic check of arguments/return values • The implementation is mature – Many features actually works, great! • Need type annotations • Supports meta-programming
end class User < Dry::Struct attribute :name, Types::String attribute :age, Types::Integer end • Can define structs with typed fields – Run-time type check – "type_struct" gem is similar
Run test suites under monitoring of TracePoint API – Hooks method call/return events, logs the passed values, and aggregate them to type information – Used by RubyMine IDE
check Method body Separated (mainly) RDL Semi-static type check Method body Embedded in code contracts. ruby Dynamic type check Arguments and return values Embedded in code dry-types Typed structs Only Dry::Struct classes Embedded in code RubyType Inference Extract type information Arguments and return values N/A
meta-programming like attr_* is difficult to support – Users will try to generate it programmatically • We may want to keep code position – To show lineno of code in type error report – Hard to manually keep the correspondence between type definition and code position in .rbi file – We may also want to keep other information
Ruby code – Alternative "RubyTypeInference" • Is not a type inference – Type inference of Ruby is hopeless – Conservative static type inference can extracts little information • Type profiler "guesses" type information – It may extract wrong type information – Assumes that user checks the result
Static type profiling cannot handle ActiveRecord – Dynamic type profiling cannot extract syntactic features (like void type) • We need a variety of type profilers – For ActiveRecord by reading DB schema – Extracting from RDoc/YARD
– FooBar={foo,bar} • Gather method calls for each parameters – x={foo,bar} – Remove general methods (like #[] and #+) to reduce false positive – Arity, parameter and return types aren't used • Assign a class that all methods match class FooBar def foo(...);...;end def bar(...);...;end end def func(x) x.foo(1) x.bar(2) end
sample code that has many user- defined classes • Manually checked the guessed result – Found some common guessing failures • Wrong result / no-match result – No quantitative evaluation yet
a mapping from parameter name to type after profiling • "req" HTTPRequest – Revise guessed types using the mapping • Fixed! DefaultFileHandler#do_GET(req:HTTPRequest, res:HTTPResponse) FileHandler#do_GET(req:HTTPRequest, res:HTTPResponse) AbstractServlet#do_GET(req:HTTPRequest, res:HTTPResponse) ProcHandler#do_GET(request:#{}, response:#{}) ERBHandler#do_GET(req:HTTPRequest, res:HTTPResponse) CGIHandler#do_GET(req:HTTPRequest, res:HTTPResponse)
guess in only limited cases – Returns formal parameter – Returns a literal or "Foo.new" – Returns an expression which is already included Type DB • See actual usage of the method? – Requires inter-procedural or whole-program analysis!
matches: • Num#<<(Num) Num • Str#<<(Num) Str • Array[α]#<<(α) Array[α] – Just take union types of them • (Overloaded types might be better) def push_42(x) x << 42 end #=> (x:(Num|Str|Array))=>(Num|Str|Array) x << 42
sample code that uses many builtin types • Manually checked the guessed result – Found some common guessing failures • Wrong result / no-match result – No quantitative evaluation yet
– Can guess void type – Can guess parameters that is not used as a receiver • Cons – Cause wrong guessing – Hard to handle type parameters (Array[α]) – Hard to scale • The bigger type DB is, more wrong results will happen
TracePoint API – The same as RubyTypeInference • Additional features – Support block types • Required enhancement of TracePoint API – Support container types: Array[Int] • By sampling elements
Recording OptCarrot may take hours – Element-sampling for Array made it faster, but still take a few minutes • Without tracing, it runs in a few seconds – It may depend on application • Profiling WEBrick is not so slow
methods returns garbage – DA cannot distinguish garbage and intended return value • SA can guess void type by heuristic – Integer#times, Array#each, etc. – if statement that has no "else" – while and until statements – Multiple assignment • (Steep scaffold now supports some of them)
– It can profile any programs • Including meta-programming like ActiveRecord • Cons – Need to run tests; it might be very slow – Hard to handle void type – TracePoint API is not enough yet – Need to cooperate with test frameworks
implementations are available • Type DB: Ruby3's key concept • Some prototypes and experiments of type profilers – Need more improvements / experiments!