Saturday, November 19, 2022

[FIXED] How to transform a variable into a bucketed variable that tells us which bucket/range it lies in, in PyTorch

Tags: numpy, pytorch, regression

Issue

I have a variable a = [0.129, 0.369, 0.758, 0.012, 0.925]. I want to transform this variable into a bucketed variable. What I mean by this is explained below.

min_bucket_value, max_bucket_value = 0, 1 (Can be anything, for example, 0 to 800, but the min value is always going to be 0)

num_divisions = 10 (For this example I've taken 10, but it can be higher as well, like 80 divisions instead of 10)

Bucket/division ranges are as shown below.

0 - 0.1 -> 0
0.1 - 0.2 -> 1
0.2 - 0.3 -> 2
0.3 - 0.4 -> 3
0.4 - 0.5 -> 4
0.5 - 0.6 -> 5
0.6 - 0.7 -> 6
0.7 - 0.8 -> 7
0.8 - 0.9 -> 8
0.9 - 1.0 -> 9

so, transformed_a = [1, 3, 7, 0, 9]

In other words, I divide the range from min_bucket_value to max_bucket_value into num_divisions equal buckets and then map each element of a to the index of the bucket it falls in.
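The mapping described above can be sketched with plain tensor arithmetic: rescale a to the interval from 0 to num_divisions and keep the integer part. This is a minimal sketch, not code from the original post; the clamp at the end is an assumption about how a value exactly equal to max_bucket_value should be treated (here it goes into the last bucket).

```python
import torch

min_bucket_value, max_bucket_value = 0.0, 1.0
num_divisions = 10

a = torch.tensor([0.129, 0.369, 0.758, 0.012, 0.925])

# Rescale a to [0, num_divisions] and keep the integer part as the bucket index.
scaled = (a - min_bucket_value) / (max_bucket_value - min_bucket_value) * num_divisions
transformed_a = torch.floor(scaled).long()

# Assumption: a value equal to max_bucket_value belongs to the last bucket.
transformed_a = transformed_a.clamp(0, num_divisions - 1)

print(transformed_a.tolist())  # [1, 3, 7, 0, 9]
```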

I've tried creating torch.linspace(min_bucket_value, max_bucket_value, num_divisions), but I'm not sure how to go from there to the index of the bucket each value belongs to.
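One way to continue from the linspace attempt is torch.bucketize, which looks up each value in a sorted list of boundaries. The sketch below is an illustration rather than the accepted answer. Note that num_divisions buckets need num_divisions + 1 edges, so the linspace call takes num_divisions + 1 points; the names edges and interior_edges are introduced here for the example only.

```python
import torch

min_bucket_value, max_bucket_value = 0.0, 1.0
num_divisions = 10
a = torch.tensor([0.129, 0.369, 0.758, 0.012, 0.925])

# num_divisions buckets have num_divisions + 1 edges; keep only the interior ones.
edges = torch.linspace(min_bucket_value, max_bucket_value, num_divisions + 1)
interior_edges = edges[1:-1]

# right=True puts a value that sits exactly on an edge into the bucket above it,
# which matches half-open ranges [lower, upper).
bucket_index = torch.bucketize(a, interior_edges, right=True)

print(bucket_index.tolist())  # [1, 3, 7, 0, 9]
```

Because only the interior edges are passed, anything below the first edge gets index 0 and anything at or above the last interior edge gets index num_divisions - 1, so out-of-range values end up in the end buckets without an explicit clamp.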

Can you please help?

EDIT

There's an extension to this problem.

Let's say that we've got a = [127, 362, 799] and I want two levels of buckets. The first is a coarse bucket, so a_coarse_transform = [12, 36, 79]. I also want a fine bucket, so that my second transformation becomes a_fine_transform = [7, 2, 9].

The fine index is the sub-range index within the coarse range. The coarse division has 80 buckets (putting 127 in coarse bucket 12), and each coarse bucket is split into 10 fine divisions, so 127 lies in coarse bucket 12 and fine bucket 7.

a can contain floats as well, e.g. a = [127.36, 362.456, 789.646].

In that case a_coarse_transform = [12, 36, 78] and a_fine_transform = [7, 2, 9]

where min_bucket_value, max_bucket_value, num_coarse_divisions, num_fine_divisions = 0, 800, 80, 10
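To check the arithmetic of the two-level example by hand, here is a small pure-Python sketch. It assumes the range is 0 to 800, which is what makes each of the 80 coarse buckets 10 units wide and each fine bucket 1 unit wide, as the example values imply. The helper name coarse_and_fine is hypothetical.

```python
import math

min_bucket_value, max_bucket_value = 0, 800  # assumed range, implied by the examples
num_coarse_divisions, num_fine_divisions = 80, 10

def coarse_and_fine(x):
    """Return (coarse_index, fine_index) for a single value x."""
    coarse_width = (max_bucket_value - min_bucket_value) / num_coarse_divisions
    fine_width = coarse_width / num_fine_divisions
    coarse = math.floor((x - min_bucket_value) / coarse_width)
    # Offset of x from the lower edge of its coarse bucket.
    offset = (x - min_bucket_value) - coarse * coarse_width
    fine = math.floor(offset / fine_width)
    return coarse, fine

print([coarse_and_fine(x) for x in [127, 362, 799]])
# [(12, 7), (36, 2), (79, 9)]
print([coarse_and_fine(x) for x in [127.36, 362.456, 789.646]])
# [(12, 7), (36, 2), (78, 9)]
```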

Solution

For a torch tensor, you can use the following code. It is adapted from Bob's answer and changed to work directly on tensors, since the NumPy version won't work on a tensor unless you first call the .cpu() method on it (and I'm not sure that is the right thing to do).

So instead, do this:

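# a is a float torch tensor; normalize it to the range [0, 1]
a1 = (a - min_bucket_value) / (max_bucket_value - min_bucket_value)

# integer part = coarse bucket index, fractional part r = position inside that bucket
a_coarse_transform, r = ((a1 * coarse_divisions)//1).type(torch.long), (a1 * coarse_divisions)%1
# integer part of the rescaled remainder = fine bucket index
a_fine_transform, r = ((r * fine_divisions)//1).type(torch.long), (r * fine_divisions)%1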
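The remainder-based code above floors a floating-point value twice, so a value sitting exactly on a bucket edge can, in principle, land one bucket low because of rounding. A sketch of an alternative that is not part of the original answer: compute one flat index over all coarse_divisions * fine_divisions fine buckets, then split it with integer division and modulo. The function name coarse_fine_buckets and the clamp behaviour are assumptions made for this illustration.

```python
import torch

def coarse_fine_buckets(a, min_value, max_value, coarse_divisions, fine_divisions):
    """Hypothetical helper: return (coarse_index, fine_index) tensors for a."""
    total = coarse_divisions * fine_divisions
    # Multiply before dividing to limit rounding error for values on a bucket edge.
    flat = torch.floor((a - min_value) * total / (max_value - min_value)).long()
    # Assumption: values at or beyond the range limits belong to the end buckets.
    flat = flat.clamp(0, total - 1)
    # Split the flat index with integer arithmetic only.
    coarse = torch.div(flat, fine_divisions, rounding_mode="floor")
    fine = flat % fine_divisions
    return coarse, fine

a = torch.tensor([127.0, 362.0, 799.0])
a_coarse_transform, a_fine_transform = coarse_fine_buckets(a, 0.0, 800.0, 80, 10)
print(a_coarse_transform.tolist(), a_fine_transform.tolist())
# [12, 36, 79] [7, 2, 9]
```

The same call with a = torch.tensor([127.36, 362.456, 789.646]) gives coarse indices [12, 36, 78] and fine indices [7, 2, 9]. Everything stays on the tensor's device, so no .cpu() call is needed.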

Answered By - hellblazer

This answer was collected from Stack Overflow and tested by the PythonFixing community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
