Guarantees on computing $a+x(b-a)$ in floating point

I want to implement the function $ f(x,a,b) = a + x(b-a)$ where all the inputs are floating point (doubles, say), such that (a) $ f(0,a,b)=a$ exactly; (b) $ f(1,a,b)=b$ exactly; (c) $ f(x,a,b) \le f(y,a,b)$ whenever $ x \le y$ ; and preferably (d) it is accurate (correct up to rounding).

Implementing $ f(x,a,b)=a+x(b-a)$ directly does not work because for example $ f(1.0,-1.0,\operatorname{prev}(1.0)) = 1.0$ (where $ \operatorname{prev}(a)$ is the floating point number before $ a$ ). And $ f(x,a,b)=b-(1-x)(b-a)$ has the same issue.

Now

$$ f(x,a,b) = (1-x)(a+x(b-a))+x(b-(1-x)(b-a)) $$

has the first two properties.

  • Does it have property (c)?
  • How accurate is it?
  • Is there a more performant way to do this?
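For anyone who wants to reproduce the failure case and check properties (a) and (b) of the blended formula, here is a minimal Python sketch. The names `lerp_naive` and `lerp_blend` are made up for illustration, Python floats are assumed to be IEEE doubles, and the inputs are assumed finite with no overflow in $b-a$. It says nothing about property (c) or about accuracy.

```python
def lerp_naive(x, a, b):
    # Direct evaluation of a + x(b - a).
    return a + x * (b - a)


def lerp_blend(x, a, b):
    # Blend of the two one-sided forms, weighted by (1 - x) and x.
    from_a = a + x * (b - a)
    from_b = b - (1.0 - x) * (b - a)
    return (1.0 - x) * from_a + x * from_b


a = -1.0
b = 1.0 - 2.0 ** -53  # prev(1.0): the double just below 1.0

print(lerp_naive(1.0, a, b) == b)  # the direct form misses b here
print(lerp_blend(0.0, a, b) == a)
print(lerp_blend(1.0, a, b) == b)
```

In this example `b - a` rounds up to exactly `2.0`, which is why the direct form returns `1.0` instead of `b`.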

How does normalised floating point binary work with two’s complement?

I’m doing AQA A-level Computer Science, and its specification states that:

Exam questions on floating point numbers will use a format in which both the normalised mantissa and exponent are represented using two’s complement

(despite this not being the IEEE standard). I can’t find much information online about how a system would work with a mantissa that is both normalised and in two’s complement. I would guess that the mantissa has to represent a value between -1 and 1; however, if that is all we require, the same number can be expressed in multiple ways, so I would not consider it normalised. For example:

1.0110 * 2^(3) = (-1 + 1/4 + 1/8) * 2^(3) = -5 

and

1.1011 * 2^(4) = (-1 + 1/2 + 1/8 + 1/16) * 2^(4) = -5 

From the A-level Paper 2 June 2017 question 11, it seems that

1 . 0 0 0 0 0 0 0   |   0 0 1 0 

is considered a negative normalised value but

1 . 1 0 0 1 1 1 0   |   1 0 0 0 

isn’t. Any enlightenment would be much appreciated.
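Both exam examples are consistent with the usual textbook convention that a two’s complement mantissa counts as normalised when its first two bits differ (`0.1...` for positive values, `1.0...` for negative ones). Assuming that convention, and assuming the binary point sits after the first mantissa bit, the following Python sketch decodes such a number and checks the rule. The function names are hypothetical, and the exponents 3 and 4 from the examples above are written as the 4-bit patterns `0011` and `0100`.

```python
def twos_complement(bits: str) -> int:
    # Interpret a bit string as a two's complement integer.
    value = int(bits, 2)
    if bits[0] == "1":
        value -= 1 << len(bits)
    return value


def decode(mantissa_bits: str, exponent_bits: str) -> float:
    # The binary point is assumed to follow the first (sign) bit of the mantissa.
    mantissa = twos_complement(mantissa_bits) / (1 << (len(mantissa_bits) - 1))
    return mantissa * 2.0 ** twos_complement(exponent_bits)


def is_normalised(mantissa_bits: str) -> bool:
    # Assumed convention: normalised means the first two bits differ.
    return mantissa_bits[0] != mantissa_bits[1]


for mantissa, exponent in [("10110", "0011"), ("11011", "0100"),
                           ("10000000", "0010"), ("11001110", "1000")]:
    print(mantissa, exponent, decode(mantissa, exponent), is_normalised(mantissa))
```

Under this rule only the first of the two representations of -5 is normalised, so each value has a single normalised form.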

algorithm for correctly rounded floating point radix conversion

Is there any generic algorithm which implements a floating point radix conversion?

Let’s say we have a $ p$ -digit FP number

$ A = \sum_{i=0}^{p-1} A_i \beta^{e-i}$

in radix $ \beta$ and with $ 0 \leq A_i < \beta$ .

How do we find the $ A'_i$ , $ e'$ values for the $ p'$ -digit base $ \gamma$ FP number

$ A' = \sum_{i=0}^{p'-1} A'_i \gamma^{e'-i}$

closest to $ A$ ?

There is one question which explicitly asks about radix 2 to radix 10 conversion, but unfortunately the answers seem to be specific to that radix combination. Here I ask about the general case.

Also is an intermediate arbitrary precision FP calculation really necessary? (as in the function strtod in David Gay’s dtoa.c)
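To make the problem statement concrete, here is a brute-force reference sketch in Python. It relies on exact rational arithmetic (`fractions.Fraction`), so it does not settle whether arbitrary precision can be avoided; it only shows what a correctly rounded result looks like. The function name `convert_radix` is hypothetical, only non-negative values are handled, and ties are assumed to round to even.

```python
from fractions import Fraction


def convert_radix(digits, e, beta, gamma, p_out):
    """Return (digits', e') of the p_out-digit base-gamma number nearest to
    sum(digits[i] * beta**(e - i)), rounding ties to even."""
    value = sum(Fraction(d) * Fraction(beta) ** (e - i)
                for i, d in enumerate(digits))
    if value == 0:
        return [0] * p_out, 0

    # Find e' with gamma**e' <= value < gamma**(e' + 1), using exact comparisons.
    g = Fraction(gamma)
    e_out = 0
    while g ** (e_out + 1) <= value:
        e_out += 1
    while g ** e_out > value:
        e_out -= 1

    # Scale so the integer part carries exactly p_out digits, then round.
    scaled = value / g ** (e_out - (p_out - 1))
    n = round(scaled)  # round() on a Fraction rounds half to even
    if n == gamma ** p_out:  # rounding carried into an extra digit
        n //= gamma
        e_out += 1

    # Split the integer significand into base-gamma digits.
    out = []
    for _ in range(p_out):
        n, d = divmod(n, gamma)
        out.append(d)
    out.reverse()
    return out, e_out


print(convert_radix([1], -1, 10, 2, 4))  # decimal 0.1 to 4 binary digits
```

The loops that locate $e'$ are linear in the exponent, which is fine for a reference implementation but not for production use.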

How to restrict floating object in domain object created by open api 3.0

I am trying to create a domain object with a meta field whose required params differ depending on the action, for example:

    { "action": "tap", "title": "choose option", "postback": "swagger" }
    { "action": "openPhonebook", "title": "choose contact", "meta": { "sessionId": "ytyut" } }
    { "action": "openKeypad", "meta": { "sessionId": "678768", "partialMessage": "type your input.." } }

In order to make sure the params in meta are required according to the action, as shown above, I followed the docs at https://swagger.io/docs/specification/data-models/inheritance-and-polymorphism/ and created the domain as follows. Is there any alternative way?

    meta:
      type: object
      properties:
        partialMessage:
          type: string
          example: Change operator to
        sessionId:
          type: string
          example: 6567678872937
    actionType:
      type: string
      enum: ['tap', 'openPhonebook', 'openKeypad']
      example: openPhonebook
    # base action object
    actionObject:
      type: object
      required:
      - action
      properties:
        action:
          $ref: '#/components/schemas/actionType'
        title:
          type: string
        postback:
          type: string
        meta:
          $ref: '#/components/schemas/meta'
      additionalProperties: false
    # action specific
    tap:
      allOf:
        - $ref: '#/components/schemas/actionObject'
        - type: object
          required:
          - postback
          - title
          properties:
            action:
              type: string
              enum: ['tap']
    openPhonebook:
      allOf:
        - $ref: '#/components/schemas/actionObject'
        - type: object
          required:
          - title
          properties:
            action:
              type: string
              enum: ['openPhonebook']
            meta:
              allOf:
              - $ref: '#/components/schemas/meta'
              - type: object
                required:
                - sessionId
    openKeypad:
      allOf:
        - $ref: '#/components/schemas/actionObject'
        - type: object
          required:
          - title
          properties:
            action:
              type: string
              enum: ['openKeypad']
            meta:
              allOf:
              - $ref: '#/components/schemas/meta'
              - type: object
                required:
                - partialMessage

Is it also possible to define a constant value that can be referenced inside an enum? Instead of hardcoding a single enum value, I want to reference a constant that may change without affecting the rest of the schema, something like the following:

    enumTap:
      type: string
      value: tap
    tap:
      allOf:
        - $ref: '#/components/schemas/actionObject'
        - type: object
          required:
          - postback
          - title
          properties:
            action:
              type: string
              enum:
                - $ref: '#/Component/enumTap'

Tcl could not save floating point numbers in binary format

I am trying to save a list of numbers in binary format (single-precision floating point), but Tcl does not save it correctly, and I do not get the correct numbers back when I read the file from VB.NET.

    set outfile6 [open "btest2.txt" w+]
    fconfigure stdout -translation binary -encoding binary
    set aa {}
    set p 0
    for {set i 1} {$i <= 1000} {incr i} {
        lappend aa [expr {1000.0 / $i}]
        puts -nonewline $outfile6 [binary format "f" [lindex $aa $p]]
        incr p
    }
    close $outfile6
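One thing that stands out in the snippet is that `fconfigure` is applied to `stdout` rather than to the file channel, so the file is presumably still written in text mode with the default encoding, which could be enough to alter some bytes. To check the file independently of VB.NET, a small Python sketch can build the bytes that 1000 single-precision values should occupy (4 bytes each, 4000 in total) and decode them again. Little-endian byte order is an assumption here (Tcl's `f` format uses the machine's native order), and the function names are hypothetical.

```python
import struct


def expected_bytes(count=1000):
    # 1000.0/1, 1000.0/2, ... packed as little-endian 32-bit floats.
    values = [1000.0 / i for i in range(1, count + 1)]
    return struct.pack(f"<{count}f", *values)


def read_singles(data):
    # Decode as many complete 4-byte floats as the data contains.
    count = len(data) // 4
    return list(struct.unpack(f"<{count}f", data[:count * 4]))


blob = expected_bytes()
print(len(blob))
print(read_singles(blob)[:2])
```

Comparing `expected_bytes()` with the result of `open("btest2.txt", "rb").read()` would show whether the file size is 4000 bytes and where the contents first differ.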