ShadowAether

joined 2 years ago
[–] ShadowAether 3 points 2 years ago

Oh I totally forgot, I have ecosia on my phone, it's good too

[–] ShadowAether 1 points 2 years ago

This is unrelated and we are going to face a similar decision to beehaw's soon as well (or we already are, considering I saw at least one post about it)

[–] ShadowAether 3 points 2 years ago (2 children)

One of my friends is a total chatgpt convert but I personally have found it hard to get good responses out of it

[–] ShadowAether 1 points 2 years ago
[–] ShadowAether 2 points 2 years ago

I used google a lot, I've also found connectedpapers.com is a fun tool (but the freemium part is annoying)

[–] ShadowAether 2 points 2 years ago

NN libraries like PyTorch and TensorFlow can be repurposed for GPU-accelerated operations like convolution by setting up specific networks with preset weights. FFT or wavelet libraries are another option, but for applications with many samples, the NN-library approach can be a good choice.
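For comparison, here is a minimal sketch of the FFT route for row-wise convolution, using only NumPy. The shapes and variable names (A, B, n_out) are illustrative assumptions, not taken from any particular application; each row of A is convolved with the matching row of B by multiplying their zero-padded spectra.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, P = 4, 10, 20                      # illustrative sizes (assumed)
A = rng.standard_normal((M, N))          # M signals of length N
B = rng.standard_normal((M, P))          # M kernels of length P

# A 'full' linear convolution has length N + P - 1; zero-padding both inputs
# to that length keeps the circular FFT convolution from wrapping around.
n_out = N + P - 1
spec = np.fft.rfft(A, n=n_out, axis=1) * np.fft.rfft(B, n=n_out, axis=1)
C_fft = np.fft.irfft(spec, n=n_out, axis=1)

# Reference: direct row-by-row convolution.
C_ref = np.vstack([np.convolve(a, b, mode="full") for a, b in zip(A, B)])
```

Whether this beats a GPU convolution depends on the kernel length and the hardware, so it is worth timing both on real data.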

[–] ShadowAether 2 points 2 years ago

Original answer: (credit @Mercury)

Late answer, but worth posting for reference. Quoting from comments of the OP:

Each row in A is being filtered by the corresponding row in B. I could implement it like that, just thought there might be a faster way.

A is on the order of 10s of gigabytes in size and I use overlap-add.

Naive / Straightforward Approach

import numpy as np
import scipy.signal as sg

M, N, P = 4, 10, 20
A = np.random.randn(M, N) # (4, 10)
B = np.random.randn(M, P) # (4, 20)

C = np.vstack([sg.convolve(a, b, 'full') for a, b in zip(A, B)])

>>> C.shape
(4, 29)

Each row in A is convolved with each respective row in B, essentially convolving M 1D arrays/vectors.

No Loop + CUDA Supported Version

It is possible to replicate this operation by using PyTorch's F.conv1d. We have to imagine A as a 4-channel, 1D signal of length 10. We wish to convolve each channel in A with a specific kernel of length 20. This is a special case called a depthwise convolution, often used in deep learning.

Note that torch's conv is implemented as cross-correlation, so we need to flip B in advance to do actual convolution.
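To see why the flip matters, here is a small pure-Python sketch (the helper names convolve_full and cross_correlate_full are made up for illustration, not library functions). It compares a textbook convolution with a zero-padded sliding dot product, which is the operation deep learning "conv" layers compute.

```python
def convolve_full(a, b):
    """Full discrete convolution: out[k] = sum_i a[i] * b[k - i]."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def cross_correlate_full(a, w):
    """Sliding dot product with len(w) - 1 zeros of padding on each side.

    The kernel w is NOT flipped here.
    """
    pad = len(w) - 1
    padded = [0.0] * pad + list(a) + [0.0] * pad
    return [
        sum(padded[k + j] * w[j] for j in range(len(w)))
        for k in range(len(a) + len(w) - 1)
    ]


a = [1.0, 2.0, 3.0]
b = [0.0, 1.0, 0.5]

conv = convolve_full(a, b)                       # true convolution
corr_flipped = cross_correlate_full(a, b[::-1])  # correlation with flipped kernel
corr_plain = cross_correlate_full(a, b)          # correlation without the flip
```

Here corr_flipped matches conv, while corr_plain does not (unless the kernel happens to be symmetric). The padding of len(w) - 1 on each side is what produces the 'full' output length.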

import torch
import torch.nn.functional as F

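@torch.no_grad()
def torch_conv(A, B):
    # A: (M, N) signals; B: (M, P) kernels, already flipped along the last axis
    M, P = A.shape[0], B.shape[1]
    # groups=M applies one kernel per row; padding=P-1 gives the 'full' length N+P-1
    C = F.conv1d(A, B[:, None, :], bias=None, stride=1, groups=M, padding=P - 1)
    return C.cpu().numpy()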

# Convert A and B to torch tensors + flip B

X = torch.from_numpy(A) # (4, 10)
W = torch.from_numpy(np.fliplr(B).copy()) # (4, 20)

# Do grouped conv and get np array
Y = torch_conv(X, W)

>>> Y.shape
(4, 29)

>>> np.allclose(C, Y)
True

Advantages of using a depthwise convolution with torch:

No loops!

The above solution can also run on CUDA/GPU, which can really speed things up if A and B are very large matrices. (From OP's comment, this seems to be the case: A is tens of gigabytes in size.)

Disadvantages:

Overhead of converting from array to tensor (should be negligible)

Need to flip B once before the operation

[–] ShadowAether 4 points 2 years ago

Good: I got support from people when things in my DnD group got weird.

Bad: Once, I asked a technical question that I had asked people irl and researched a lot and not found what I was looking for. On reddit, I had people making assumptions and nitpicking the terminology while avoiding the actual question completely. It was a good example of the CS/math departments friction (which makes a whole lot more sense to me now). I did get a better answer on another site by just posting the equation and using zero jargon but I ended up abandoning that topic bc it was impractical.

[–] ShadowAether 2 points 2 years ago

"people amazed they can use an LLM trained with webpages as search engine" again

[–] ShadowAether 3 points 2 years ago

Great game, I loved the story and the dlc was good too

[–] ShadowAether 1 points 2 years ago

Burrows is a far better name for instances tho

 

Not OP. This question is being reposted to preserve technical content removed from elsewhere. Feel free to add your own answers/discussion.

Original question: When training a model for image classification it is common to use pooling layers to reduce the dimensionality, as we only care about the final node values corresponding to the categorical probabilities. In the realm of VAEs, on the other hand, where we are attempting to reduce the dimensionality and subsequently increase it again, I have rarely seen pooling layers being used. Is it normal to use pooling layers in VAEs? If not, what's the intuition here? Is it because of their injective nature?

 

Or snack prep. Or if you make/use freezer packs (prepped raw/half cooked ingredients that can just be thrown in a pot) which I haven't tried but it seems interesting.

 

Icebreaker post! What's something related to ML that you are in the process of learning more about or just learned?

 

cross-posted from: https://sh.itjust.works/post/48227

Presented on Wednesday, June 21 at 12:00 PM ET/16:00 UTC by Daniel Zingaro, Associate Teaching Professor at the University of Toronto, and Leo Porter, Associate Professor of Computer Science and Engineering at UC San Diego. Michelle Craig, Professor of Computer Science at the University of Toronto and member of the ACM Education Board, will moderate the questions and answers session following the talk.

 

Have a cool new project idea that involves machine learning but not sure where to start? Years of experience but your new model is just not working out? Ask stupid questions (or hard ones) about machine learning here! Join us at [email protected]

How are we different from [email protected]? This community is more focused on helping others with the math, programming and implementation behind machine learning applications (more general than just AI). Looking at the cool stuff you can do with AI is always fun but it's good to have a place to talk about the problems and challenges.

 

Have a community that you want to recommend or promote on here? Just made a community and want to let everyone know? Write it in the comments!

(Translated from French:) Want to recommend a community? Or announce a community you created? Share it here!

Remember to link communities from other instances like so, so that the link takes users to the right place without taking them off their own instance:

[/c/[email protected]](/c/[email protected])

/c/[email protected]

9
submitted 2 years ago* (last edited 2 years ago) by ShadowAether to c/learnmachinelearning
 

Overfitting and underfitting are often shown as stages of training (not enough training: underfit, too much training: overfit) or as a function of model complexity (not enough complexity: underfit, too much complexity: overfit). Diagrams like this image seem to suggest that underfitting and overfitting can't happen at the same time, but in theory, shouldn't it be possible for a model to be both at once?

 


 

There are a lot of each out there, what's your preference and why?

 

Let's assume the students have a base knowledge of linear algebra/calculus. What are the key topics from statistics that should be covered in an introductory course?
